* [PATCH v3] arm64: errata: Work around AmpereOne's erratum AC04_CPU_23
@ 2025-05-08 21:00 D Scott Phillips
2025-05-09 10:08 ` Oliver Upton
0 siblings, 1 reply; 3+ messages in thread
From: D Scott Phillips @ 2025-05-08 21:00 UTC (permalink / raw)
To: Catalin Marinas, James Clark, James Morse, Joey Gouly,
Kevin Brodsky, Marc Zyngier, Mark Brown, Mark Rutland,
Oliver Upton, Rob Herring (Arm), Shameer Kolothum, Shiqi Liu,
Will Deacon, Yicong Yang, kvmarm, linux-arm-kernel, linux-kernel
On AmpereOne AC04, updates to HCR_EL2 can rarely corrupt simultaneous
translations for data addresses initiated by load/store instructions.
Only instruction initiated translations are vulnerable, not translations
from prefetches for example. A DSB before the store to HCR_EL2 is
sufficient to prevent older instructions from hitting the window for
corruption, and an ISB after is sufficient to prevent younger
instructions from hitting the window for corruption.
Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
---
v2: https://lore.kernel.org/kvmarm/20250425024741.1309893-1-scott@os.amperecomputing.com/
Changes since v2:
- Apply the workaround before alternatives are applied (Marc)
- Also catch stores to HCR_EL2 in assembly files (Marc)
- Added a sysreg_clear_set_hcr() helper for the HCR_EL2 fiddling in
vgic-v3-sr.c that I had previously missed.
v1: https://lore.kernel.org/kvmarm/20250415154711.1698544-2-scott@os.amperecomputing.com/
Changes since v1:
- Add a write_sysreg_hcr() helper (Oliver)
- Add more specific wording in the errata description (Oliver & Marc)
arch/arm64/Kconfig | 17 ++++++++++++++++
arch/arm64/include/asm/el2_setup.h | 2 +-
arch/arm64/include/asm/hardirq.h | 4 ++--
arch/arm64/include/asm/sysreg.h | 27 +++++++++++++++++++++++++
arch/arm64/kernel/cpu_errata.c | 14 +++++++++++++
arch/arm64/kernel/hyp-stub.S | 2 +-
arch/arm64/kvm/at.c | 8 ++++----
arch/arm64/kvm/hyp/include/hyp/switch.h | 2 +-
arch/arm64/kvm/hyp/nvhe/host.S | 2 +-
arch/arm64/kvm/hyp/nvhe/hyp-init.S | 4 ++--
arch/arm64/kvm/hyp/nvhe/mem_protect.c | 2 +-
arch/arm64/kvm/hyp/nvhe/switch.c | 2 +-
arch/arm64/kvm/hyp/vgic-v3-sr.c | 4 ++--
arch/arm64/kvm/hyp/vhe/switch.c | 2 +-
arch/arm64/kvm/hyp/vhe/tlb.c | 4 ++--
arch/arm64/tools/cpucaps | 1 +
16 files changed, 78 insertions(+), 19 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index a182295e6f08b..3ae4e80e3002b 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -464,6 +464,23 @@ config AMPERE_ERRATUM_AC03_CPU_38
If unsure, say Y.
+config AMPERE_ERRATUM_AC04_CPU_23
+ bool "AmpereOne: AC04_CPU_23: Failure to synchronize writes to HCR_EL2 may corrupt address translations."
+ default y
+ help
+ This option adds an alternative code sequence to work around Ampere
+ errata AC04_CPU_23 on AmpereOne.
+
+ Updates to HCR_EL2 can rarely corrupt simultaneous translations for
+ data addresses initiated by load/store instructions. Only
+ instruction initiated translations are vulnerable, not translations
+ from prefetches for example. A DSB before the store to HCR_EL2 is
+ sufficient to prevent older instructions from hitting the window
+ for corruption, and an ISB after is sufficient to prevent younger
+ instructions from hitting the window for corruption.
+
+ If unsure, say Y.
+
config ARM64_WORKAROUND_CLEAN_CACHE
bool
diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
index ebceaae3c749b..2500fd0a1f66a 100644
--- a/arch/arm64/include/asm/el2_setup.h
+++ b/arch/arm64/include/asm/el2_setup.h
@@ -38,7 +38,7 @@
orr x0, x0, #HCR_E2H
.LnVHE_\@:
- msr hcr_el2, x0
+ msr_hcr_el2 x0
isb
.endm
diff --git a/arch/arm64/include/asm/hardirq.h b/arch/arm64/include/asm/hardirq.h
index cbfa7b6f2e098..77d6b8c63d4e6 100644
--- a/arch/arm64/include/asm/hardirq.h
+++ b/arch/arm64/include/asm/hardirq.h
@@ -41,7 +41,7 @@ do { \
\
___hcr = read_sysreg(hcr_el2); \
if (!(___hcr & HCR_TGE)) { \
- write_sysreg(___hcr | HCR_TGE, hcr_el2); \
+ write_sysreg_hcr(___hcr | HCR_TGE); \
isb(); \
} \
/* \
@@ -82,7 +82,7 @@ do { \
*/ \
barrier(); \
if (!___ctx->cnt && !(___hcr & HCR_TGE)) \
- write_sysreg(___hcr, hcr_el2); \
+ write_sysreg_hcr(___hcr); \
} while (0)
static inline void ack_bad_irq(unsigned int irq)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 2639d3633073d..7284828f0dc9e 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -1091,6 +1091,15 @@
__emit_inst(0xd5000000|(\sreg)|(.L__gpr_num_\rt))
.endm
+ .macro msr_hcr_el2, reg
+#if IS_ENABLED(CONFIG_AMPERE_ERRATUM_AC04_CPU_23)
+ dsb nsh
+ msr hcr_el2, \reg
+ isb
+#else
+ msr hcr_el2, \reg
+#endif
+ .endm
#else
#include <linux/bitfield.h>
@@ -1178,6 +1187,13 @@
write_sysreg(__scs_new, sysreg); \
} while (0)
+#define sysreg_clear_set_hcr(clear, set) do { \
+ u64 __scs_val = read_sysreg(hcr_el2); \
+ u64 __scs_new = (__scs_val & ~(u64)(clear)) | (set); \
+ if (__scs_new != __scs_val) \
+ write_sysreg_hcr(__scs_new); \
+} while (0)
+
#define sysreg_clear_set_s(sysreg, clear, set) do { \
u64 __scs_val = read_sysreg_s(sysreg); \
u64 __scs_new = (__scs_val & ~(u64)(clear)) | (set); \
@@ -1185,6 +1201,17 @@
write_sysreg_s(__scs_new, sysreg); \
} while (0)
+#define write_sysreg_hcr(__val) do { \
+ if (IS_ENABLED(CONFIG_AMPERE_ERRATUM_AC04_CPU_23) && \
+ (!system_capabilities_finalized() || \
+ alternative_has_cap_unlikely(ARM64_WORKAROUND_AMPERE_AC04_CPU_23))) \
+ asm volatile("dsb nsh; msr hcr_el2, %x0; isb" \
+ : : "rZ" (__val)); \
+ else \
+ asm volatile("msr hcr_el2, %x0" \
+ : : "rZ" (__val)); \
+} while (0)
+
#define read_sysreg_par() ({ \
u64 par; \
asm(ALTERNATIVE("nop", "dmb sy", ARM64_WORKAROUND_1508412)); \
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 6b0ad5070d3e0..59d723c9ab8f5 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -557,6 +557,13 @@ static const struct midr_range erratum_ac03_cpu_38_list[] = {
};
#endif
+#ifdef CONFIG_AMPERE_ERRATUM_AC04_CPU_23
+static const struct midr_range erratum_ac04_cpu_23_list[] = {
+ MIDR_ALL_VERSIONS(MIDR_AMPERE1A),
+ {},
+};
+#endif
+
const struct arm64_cpu_capabilities arm64_errata[] = {
#ifdef CONFIG_ARM64_WORKAROUND_CLEAN_CACHE
{
@@ -875,6 +882,13 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
.capability = ARM64_WORKAROUND_AMPERE_AC03_CPU_38,
ERRATA_MIDR_RANGE_LIST(erratum_ac03_cpu_38_list),
},
+#endif
+#ifdef CONFIG_AMPERE_ERRATUM_AC04_CPU_23
+ {
+ .desc = "AmpereOne erratum AC04_CPU_23",
+ .capability = ARM64_WORKAROUND_AMPERE_AC04_CPU_23,
+ ERRATA_MIDR_RANGE_LIST(erratum_ac04_cpu_23_list),
+ },
#endif
{
.desc = "Broken CNTVOFF_EL2",
diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index ae990da1eae5a..36e2d26b54f5c 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -97,7 +97,7 @@ SYM_CODE_START_LOCAL(__finalise_el2)
2:
// Engage the VHE magic!
mov_q x0, HCR_HOST_VHE_FLAGS
- msr hcr_el2, x0
+ msr_hcr_el2 x0
isb
// Use the EL1 allocated stack, per-cpu offset
diff --git a/arch/arm64/kvm/at.c b/arch/arm64/kvm/at.c
index f74a66ce3064b..9c13e70fadf5e 100644
--- a/arch/arm64/kvm/at.c
+++ b/arch/arm64/kvm/at.c
@@ -516,7 +516,7 @@ static void __mmu_config_save(struct mmu_config *config)
static void __mmu_config_restore(struct mmu_config *config)
{
- write_sysreg(config->hcr, hcr_el2);
+ write_sysreg_hcr(config->hcr);
/*
* ARM errata 1165522 and 1530923 require TGE to be 1 before
@@ -1267,7 +1267,7 @@ static u64 __kvm_at_s1e01_fast(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
skip_mmu_switch:
/* Clear TGE, enable S2 translation, we're rolling */
- write_sysreg((config.hcr & ~HCR_TGE) | HCR_VM, hcr_el2);
+ write_sysreg_hcr((config.hcr & ~HCR_TGE) | HCR_VM);
isb();
switch (op) {
@@ -1350,7 +1350,7 @@ void __kvm_at_s1e2(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
if (!vcpu_el2_e2h_is_set(vcpu))
val |= HCR_NV | HCR_NV1;
- write_sysreg(val, hcr_el2);
+ write_sysreg_hcr(val);
isb();
par = SYS_PAR_EL1_F;
@@ -1375,7 +1375,7 @@ void __kvm_at_s1e2(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
if (!fail)
par = read_sysreg_par();
- write_sysreg(hcr, hcr_el2);
+ write_sysreg_hcr(hcr);
isb();
}
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index b741ea6aefa58..06aa37dbc957d 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -301,7 +301,7 @@ static inline void ___activate_traps(struct kvm_vcpu *vcpu, u64 hcr)
if (cpus_have_final_cap(ARM64_WORKAROUND_CAVIUM_TX2_219_TVM))
hcr |= HCR_TVM;
- write_sysreg(hcr, hcr_el2);
+ write_sysreg_hcr(hcr);
if (cpus_have_final_cap(ARM64_HAS_RAS_EXTN) && (hcr & HCR_VSE))
write_sysreg_s(vcpu->arch.vsesr_el2, SYS_VSESR_EL2);
diff --git a/arch/arm64/kvm/hyp/nvhe/host.S b/arch/arm64/kvm/hyp/nvhe/host.S
index 58f0cb2298cc2..eef15b374abb0 100644
--- a/arch/arm64/kvm/hyp/nvhe/host.S
+++ b/arch/arm64/kvm/hyp/nvhe/host.S
@@ -124,7 +124,7 @@ SYM_FUNC_START(__hyp_do_panic)
/* Ensure host stage-2 is disabled */
mrs x0, hcr_el2
bic x0, x0, #HCR_VM
- msr hcr_el2, x0
+ msr_hcr_el2 x0
isb
tlbi vmalls12e1
dsb nsh
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-init.S b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
index f8af11189572f..aada42522e7be 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-init.S
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
@@ -100,7 +100,7 @@ SYM_CODE_START_LOCAL(___kvm_hyp_init)
msr mair_el2, x1
ldr x1, [x0, #NVHE_INIT_HCR_EL2]
- msr hcr_el2, x1
+ msr_hcr_el2 x1
mov x2, #HCR_E2H
and x2, x1, x2
@@ -262,7 +262,7 @@ reset:
alternative_if ARM64_KVM_PROTECTED_MODE
mov_q x5, HCR_HOST_NVHE_FLAGS
- msr hcr_el2, x5
+ msr_hcr_el2 x5
alternative_else_nop_endif
/* Install stub vectors */
diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index 31173c6946951..d1488d4e51413 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -309,7 +309,7 @@ int __pkvm_prot_finalize(void)
*/
kvm_flush_dcache_to_poc(params, sizeof(*params));
- write_sysreg(params->hcr_el2, hcr_el2);
+ write_sysreg_hcr(params->hcr_el2);
__load_stage2(&host_mmu.arch.mmu, &host_mmu.arch);
/*
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index 7d2ba6ef02618..4024fafbe3594 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -142,7 +142,7 @@ static void __deactivate_traps(struct kvm_vcpu *vcpu)
__deactivate_traps_common(vcpu);
- write_sysreg(this_cpu_ptr(&kvm_init_params)->hcr_el2, hcr_el2);
+ write_sysreg_hcr(this_cpu_ptr(&kvm_init_params)->hcr_el2);
__deactivate_cptr_traps(vcpu);
write_sysreg(__kvm_hyp_host_vector, vbar_el2);
diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
index 50aa8dbcae75b..f8a91780e49a9 100644
--- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
+++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
@@ -446,7 +446,7 @@ u64 __vgic_v3_get_gic_config(void)
if (has_vhe()) {
flags = local_daif_save();
} else {
- sysreg_clear_set(hcr_el2, 0, HCR_AMO | HCR_FMO | HCR_IMO);
+ sysreg_clear_set_hcr(0, HCR_AMO | HCR_FMO | HCR_IMO);
isb();
}
@@ -461,7 +461,7 @@ u64 __vgic_v3_get_gic_config(void)
if (has_vhe()) {
local_daif_restore(flags);
} else {
- sysreg_clear_set(hcr_el2, HCR_AMO | HCR_FMO | HCR_IMO, 0);
+ sysreg_clear_set_hcr(HCR_AMO | HCR_FMO | HCR_IMO, 0);
isb();
}
diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
index 731a0378ed132..faacdfb328af6 100644
--- a/arch/arm64/kvm/hyp/vhe/switch.c
+++ b/arch/arm64/kvm/hyp/vhe/switch.c
@@ -184,7 +184,7 @@ static void __deactivate_traps(struct kvm_vcpu *vcpu)
___deactivate_traps(vcpu);
- write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2);
+ write_sysreg_hcr(HCR_HOST_VHE_FLAGS);
if (has_cntpoff()) {
struct timer_map map;
diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c
index 3d50a1bd2bdbc..ec25698186297 100644
--- a/arch/arm64/kvm/hyp/vhe/tlb.c
+++ b/arch/arm64/kvm/hyp/vhe/tlb.c
@@ -63,7 +63,7 @@ static void enter_vmid_context(struct kvm_s2_mmu *mmu,
__load_stage2(mmu, mmu->arch);
val = read_sysreg(hcr_el2);
val &= ~HCR_TGE;
- write_sysreg(val, hcr_el2);
+ write_sysreg_hcr(val);
isb();
}
@@ -73,7 +73,7 @@ static void exit_vmid_context(struct tlb_inv_context *cxt)
* We're done with the TLB operation, let's restore the host's
* view of HCR_EL2.
*/
- write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2);
+ write_sysreg_hcr(HCR_HOST_VHE_FLAGS);
isb();
/* ... and the stage-2 MMU context that we switched away from */
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index 772c1b008e437..72f10b74ce807 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -94,6 +94,7 @@ WORKAROUND_2457168
WORKAROUND_2645198
WORKAROUND_2658417
WORKAROUND_AMPERE_AC03_CPU_38
+WORKAROUND_AMPERE_AC04_CPU_23
WORKAROUND_TRBE_OVERWRITE_FILL_MODE
WORKAROUND_TSB_FLUSH_FAILURE
WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
--
2.49.0
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH v3] arm64: errata: Work around AmpereOne's erratum AC04_CPU_23
2025-05-08 21:00 [PATCH v3] arm64: errata: Work around AmpereOne's erratum AC04_CPU_23 D Scott Phillips
@ 2025-05-09 10:08 ` Oliver Upton
2025-05-13 18:42 ` D Scott Phillips
0 siblings, 1 reply; 3+ messages in thread
From: Oliver Upton @ 2025-05-09 10:08 UTC (permalink / raw)
To: D Scott Phillips
Cc: Catalin Marinas, James Clark, James Morse, Joey Gouly,
Kevin Brodsky, Marc Zyngier, Mark Brown, Mark Rutland,
Rob Herring (Arm), Shameer Kolothum, Shiqi Liu, Will Deacon,
Yicong Yang, kvmarm, linux-arm-kernel, linux-kernel
Hey D Scott,
On Thu, May 08, 2025 at 02:00:09PM -0700, D Scott Phillips wrote:
> On AmpereOne AC04, updates to HCR_EL2 can rarely corrupt simultaneous
> translations for data addresses initiated by load/store instructions.
> Only instruction initiated translations are vulnerable, not translations
> from prefetches for example. A DSB before the store to HCR_EL2 is
> sufficient to prevent older instructions from hitting the window for
> corruption, and an ISB after is sufficient to prevent younger
> instructions from hitting the window for corruption.
>
> Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
Overall looks good, still needs an entry in Documentation/arch/arm64/silicon-errata.rst
which Marc noted in v2.
With that addressed:
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Thanks,
Oliver
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH v3] arm64: errata: Work around AmpereOne's erratum AC04_CPU_23
2025-05-09 10:08 ` Oliver Upton
@ 2025-05-13 18:42 ` D Scott Phillips
0 siblings, 0 replies; 3+ messages in thread
From: D Scott Phillips @ 2025-05-13 18:42 UTC (permalink / raw)
To: Oliver Upton
Cc: Catalin Marinas, James Clark, James Morse, Joey Gouly,
Kevin Brodsky, Marc Zyngier, Mark Brown, Mark Rutland,
Rob Herring (Arm), Shameer Kolothum, Shiqi Liu, Will Deacon,
Yicong Yang, kvmarm, linux-arm-kernel, linux-kernel
Oliver Upton <oliver.upton@linux.dev> writes:
> Hey D Scott,
>
> On Thu, May 08, 2025 at 02:00:09PM -0700, D Scott Phillips wrote:
>> On AmpereOne AC04, updates to HCR_EL2 can rarely corrupt simultaneous
>> translations for data addresses initiated by load/store instructions.
>> Only instruction initiated translations are vulnerable, not translations
>> from prefetches for example. A DSB before the store to HCR_EL2 is
>> sufficient to prevent older instructions from hitting the window for
>> corruption, and an ISB after is sufficient to prevent younger
>> instructions from hitting the window for corruption.
>>
>> Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
>
> Overall looks good, still needs an entry in Documentation/arch/arm64/silicon-errata.rst
> which Marc noted in v2.
Ah, sorry for missing that and making you repeat yourself. I'll fix
that.
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2025-05-13 19:23 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-08 21:00 [PATCH v3] arm64: errata: Work around AmpereOne's erratum AC04_CPU_23 D Scott Phillips
2025-05-09 10:08 ` Oliver Upton
2025-05-13 18:42 ` D Scott Phillips
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).