* [PATCH v2] arm64: errata: Handle Apple WFI State Loss
@ 2026-06-15 12:21 Yureka Lilian
2026-06-15 12:59 ` Nick Chan
2026-06-15 15:02 ` Will Deacon
0 siblings, 2 replies; 8+ messages in thread
From: Yureka Lilian @ 2026-06-15 12:21 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon
Cc: linux-arm-kernel, linux-kernel, asahi, Sasha Finkelstein,
Yureka Lilian
Apple Silicon CPUs can lose register state in WFI, leading to crashes
in the idle loop early in the boot process.
This applies to any previous Apple Silicon CPUs too, but is worked
around by configuring the WFI mode in SYS_IMP_APL_CYC_OVRD sysreg
during m1n1's chickens setup.
This workaround no longer exists since M4.
Add a workaround capability for replacing wfi and wfit with nop, and
an erratum to enable it on the affected CPUs if the workaround using the
sysreg is not already applied. Leave the decision whether the sysreg
workaround can be used up to the earlier parts of the boot chain which
already configure the Apple Silicon chicken bits.
This alternative has to be applied in early boot, since otherwise some
cores might enter the idle loop before apply_alternatives_all() is run.
Reviewed-by: Sasha Finkelstein <k@chaosmail.tech>
Signed-off-by: Yureka Lilian <yureka@cyberchaos.dev>
---
Changes since v1:
Restricted the erratum to EL2 only, since in EL1 we'd expect the
hypervisor to trap WFI and handle the erratum.
Tested on M4 and M4 Pro (which now sometimes nondeterministically
crash later during boot).
Successfully booted on M3 Max with the SYS_IMP_APL_CYC_OVRD
workaround disabled in the bootloader, as well as A18 Pro (which,
like M4 / M4 Pro, doesn't have SYS_IMP_APL_CYC_OVRD).
There is probably a better place for the SYS_IMP_APL_CYC_OVRD
defines, which I currently put in the middle of cpu_errata.c, but I
wouldn't know where.
---
arch/arm64/Kconfig | 12 ++++++++++++
arch/arm64/include/asm/barrier.h | 19 ++++++++++++++++---
arch/arm64/kernel/cpu_errata.c | 21 +++++++++++++++++++++
arch/arm64/tools/cpucaps | 1 +
4 files changed, 50 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index b3afe0688919..8c8ff069856f 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -453,6 +453,18 @@ config AMPERE_ERRATUM_AC04_CPU_23
If unsure, say Y.
+config APPLE_ERRATUM_WFI_STATE
+ bool "Apple Silicon: WFI loses state"
+ default y
+ help
+ This option adds an alternative code sequence to work around some
+ Apple Silicon CPUs losing register state during wfi and wfit
+ instructions.
+
+ As a workaround, the wfi and wfit instructions are replaced with nop
+ operations via the alternative framework if an affected CPU is
+ detected.
+
config ARM64_WORKAROUND_CLEAN_CACHE
bool
diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index 9495c4441a46..f72eddc7c434 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -20,9 +20,22 @@
#define wfe() asm volatile("wfe" : : : "memory")
#define wfet(val) asm volatile("msr s0_3_c1_c0_0, %0" \
: : "r" (val) : "memory")
-#define wfi() asm volatile("wfi" : : : "memory")
-#define wfit(val) asm volatile("msr s0_3_c1_c0_1, %0" \
- : : "r" (val) : "memory")
+#define wfi() \
+ do { \
+ asm volatile( \
+ ALTERNATIVE("wfi", \
+ "nop", \
+ ARM64_WORKAROUND_WFI_STATE) \
+ : : : "memory"); \
+ } while (0)
+#define wfit(val) \
+ do { \
+ asm volatile( \
+ ALTERNATIVE("msr s0_3_c1_c0_1, %0", \
+ "nop", \
+ ARM64_WORKAROUND_WFI_STATE) \
+ : : "r" (val) : "memory"); \
+ } while (0)
#define isb() asm volatile("isb" : : : "memory")
#define dmb(opt) asm volatile("dmb " #opt : : : "memory")
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 1995e1198648..8c9a194eddc4 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -309,6 +309,19 @@ static void cpu_enable_impdef_pmuv3_traps(const struct arm64_cpu_capabilities *_
sysreg_clear_set_s(SYS_HACR_EL2, 0, BIT(56));
}
+#ifdef CONFIG_APPLE_ERRATUM_WFI_STATE
+static bool has_apple_erratum_wfi_state(const struct arm64_cpu_capabilities *entry, int scope)
+{
+#define SYS_IMP_APL_CYC_OVRD sys_reg(3, 5, 15, 5, 0)
+#define CYC_OVRD_WFI_MODE_MASK GENMASK(26, 24)
+ if (read_cpuid_implementor() != ARM_CPU_IMP_APPLE)
+ return false;
+ if ((read_sysreg(CurrentEL) >> 2) != 2)
+ return false;
+ return FIELD_GET(CYC_OVRD_WFI_MODE_MASK, read_sysreg_s(SYS_IMP_APL_CYC_OVRD)) != 2;
+}
+#endif
+
#ifdef CONFIG_ARM64_WORKAROUND_REPEAT_TLBI
static const struct arm64_cpu_capabilities arm64_repeat_tlbi_list[] = {
#ifdef CONFIG_QCOM_FALKOR_ERRATUM_1009
@@ -1009,6 +1022,14 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
.matches = has_impdef_pmuv3,
.cpu_enable = cpu_enable_impdef_pmuv3_traps,
},
+#ifdef CONFIG_APPLE_ERRATUM_WFI_STATE
+ {
+ .desc = "Apple WFI loses state",
+ .capability = ARM64_WORKAROUND_WFI_STATE,
+ .type = ARM64_CPUCAP_SCOPE_BOOT_CPU | ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU,
+ .matches = has_apple_erratum_wfi_state,
+ },
+#endif
{
}
};
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index 9b85a84f6fd4..bbf8c15d79b0 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -128,3 +128,4 @@ WORKAROUND_REPEAT_TLBI
WORKAROUND_SPECULATIVE_AT
WORKAROUND_SPECULATIVE_SSBS
WORKAROUND_SPECULATIVE_UNPRIV_LOAD
+WORKAROUND_WFI_STATE
---
base-commit: c425609d6ac4012c8bbf01ec2e10e801b1923a7b
change-id: 20260614-wfi-erratum-7a9f305f601f
Best regards,
--
Yureka Lilian <yureka@cyberchaos.dev>
* Re: [PATCH v2] arm64: errata: Handle Apple WFI State Loss
2026-06-15 12:21 [PATCH v2] arm64: errata: Handle Apple WFI State Loss Yureka Lilian
@ 2026-06-15 12:59 ` Nick Chan
2026-06-15 15:02 ` Will Deacon
1 sibling, 0 replies; 8+ messages in thread
From: Nick Chan @ 2026-06-15 12:59 UTC (permalink / raw)
To: Yureka Lilian, Catalin Marinas, Will Deacon
Cc: linux-arm-kernel, linux-kernel, asahi, Sasha Finkelstein
On 2026/6/15 at 8:21 PM, Yureka Lilian wrote:
> Apple Silicon CPUs can lose register state in WFI, leading to crashes
> in the idle loop early in the boot process.
> This applies to any previous Apple Silicon CPUs too, but is worked
> around by configuring the WFI mode in SYS_IMP_APL_CYC_OVRD sysreg
> during m1n1's chickens setup.
> This workaround no longer exists since M4.
>
> Add a workaround capability for replacing wfi and wfit with nop, and
> an erratum to enable it on the affected CPUs if the workaround using the
> sysreg is not already applied. Leave the decision whether the sysreg
> workaround can be used up to the earlier parts of the boot chain which
> already configure the Apple Silicon chicken bits.
>
> This alternative has to be applied in early boot, since otherwise some
> cores might enter the idle loop before apply_alternatives_all() is run.
>
> Reviewed-by: Sasha Finkelstein <k@chaosmail.tech>
> Signed-off-by: Yureka Lilian <yureka@cyberchaos.dev>
> ---
> Changes since v1:
> Restricted the erratum to EL2 only, since in EL1 we'd expect the
> hypervisor to trap WFI and handle the erratum.
>
> Tested on M4 and M4 Pro (which now sometimes nondeterministically
> crash later during boot).
> Successfully booted on M3 Max with the SYS_IMP_APL_CYC_OVRD
> workaround disabled in the bootloader, as well as A18 Pro (which,
> like M4 / M4 Pro, doesn't have SYS_IMP_APL_CYC_OVRD).
>
> There is probably a better place for the SYS_IMP_APL_CYC_OVRD
> defines, which I currently put in the middle of cpu_errata.c, but I
> wouldn't know where.
> ---
> arch/arm64/Kconfig | 12 ++++++++++++
> arch/arm64/include/asm/barrier.h | 19 ++++++++++++++++---
> arch/arm64/kernel/cpu_errata.c | 21 +++++++++++++++++++++
> arch/arm64/tools/cpucaps | 1 +
> 4 files changed, 50 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index b3afe0688919..8c8ff069856f 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -453,6 +453,18 @@ config AMPERE_ERRATUM_AC04_CPU_23
>
> If unsure, say Y.
>
> +config APPLE_ERRATUM_WFI_STATE
> + bool "Apple Silicon: WFI loses state"
> + default y
> + help
> + This option adds an alternative code sequence to work around some
> + Apple Silicon CPUs losing register state during wfi and wfit
> + instructions.
> +
> + As a workaround, the wfi and wfit instructions are replaced with nop
> + operations via the alternative framework if an affected CPU is
> + detected.
> +
> config ARM64_WORKAROUND_CLEAN_CACHE
> bool
>
> diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
> index 9495c4441a46..f72eddc7c434 100644
> --- a/arch/arm64/include/asm/barrier.h
> +++ b/arch/arm64/include/asm/barrier.h
> @@ -20,9 +20,22 @@
> #define wfe() asm volatile("wfe" : : : "memory")
> #define wfet(val) asm volatile("msr s0_3_c1_c0_0, %0" \
> : : "r" (val) : "memory")
> -#define wfi() asm volatile("wfi" : : : "memory")
> -#define wfit(val) asm volatile("msr s0_3_c1_c0_1, %0" \
> - : : "r" (val) : "memory")
> +#define wfi() \
> + do { \
> + asm volatile( \
> + ALTERNATIVE("wfi", \
> + "nop", \
> + ARM64_WORKAROUND_WFI_STATE) \
> + : : : "memory"); \
> + } while (0)
> +#define wfit(val) \
> + do { \
> + asm volatile( \
> + ALTERNATIVE("msr s0_3_c1_c0_1, %0", \
> + "nop", \
> + ARM64_WORKAROUND_WFI_STATE) \
> + : : "r" (val) : "memory"); \
> + } while (0)
>
> #define isb() asm volatile("isb" : : : "memory")
> #define dmb(opt) asm volatile("dmb " #opt : : : "memory")
> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> index 1995e1198648..8c9a194eddc4 100644
> --- a/arch/arm64/kernel/cpu_errata.c
> +++ b/arch/arm64/kernel/cpu_errata.c
> @@ -309,6 +309,19 @@ static void cpu_enable_impdef_pmuv3_traps(const struct arm64_cpu_capabilities *_
> sysreg_clear_set_s(SYS_HACR_EL2, 0, BIT(56));
> }
>
> +#ifdef CONFIG_APPLE_ERRATUM_WFI_STATE
> +static bool has_apple_erratum_wfi_state(const struct arm64_cpu_capabilities *entry, int scope)
> +{
> +#define SYS_IMP_APL_CYC_OVRD sys_reg(3, 5, 15, 5, 0)
> +#define CYC_OVRD_WFI_MODE_MASK GENMASK(26, 24)
> + if (read_cpuid_implementor() != ARM_CPU_IMP_APPLE)
> + return false;
> + if ((read_sysreg(CurrentEL) >> 2) != 2)
> + return false;
Nested virtualization exists, and may be supported by KVM or macOS HVF, so
running at EL2 does not by itself mean we are on bare metal.
Additionally, presumably the workaround should be applied under the m1n1 hypervisor.
The solution is less clear. A reliable way to detect bare metal or
"almost bare metal" is to check for "apple,arm-platform" using
of_machine_is_compatible(), but that involves performing a non-CPU check inside a
CPU errata function, so that does not feel right to me.
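
For reference, the decision the quoted has_apple_erratum_wfi_state() makes can
be written as a pure function over the three inputs it reads. This is only a
user-space sketch for reasoning about the check, not kernel code: the field
positions are copied from the patch, while struct cpu_state, current_el(),
wfi_mode() and needs_wfi_nop() are hypothetical names. It takes the register
values as plain integers and says nothing about what reading
SYS_IMP_APL_CYC_OVRD does on hardware that lacks it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Field positions copied from the patch; all names below are illustrative. */
#define CURRENTEL_EL_SHIFT         2
#define CURRENTEL_EL_MASK          UINT64_C(0x3)
#define CYC_OVRD_WFI_MODE_SHIFT    24
#define CYC_OVRD_WFI_MODE_MASK     (UINT64_C(0x7) << CYC_OVRD_WFI_MODE_SHIFT) /* GENMASK(26, 24) */
#define WFI_MODE_SYSREG_WORKAROUND 2 /* value the patch compares against */

/* Hypothetical snapshot of the state the kernel would read from registers. */
struct cpu_state {
	bool     apple_implementer; /* MIDR implementer is Apple */
	uint64_t currentel;         /* raw CurrentEL value */
	uint64_t cyc_ovrd;          /* raw SYS_IMP_APL_CYC_OVRD value */
};

/* Extract the exception level from bits [3:2] of CurrentEL. */
static unsigned int current_el(uint64_t currentel)
{
	return (unsigned int)((currentel >> CURRENTEL_EL_SHIFT) & CURRENTEL_EL_MASK);
}

/* Extract the WFI mode field, bits [26:24]. */
static unsigned int wfi_mode(uint64_t cyc_ovrd)
{
	return (unsigned int)((cyc_ovrd & CYC_OVRD_WFI_MODE_MASK) >>
			      CYC_OVRD_WFI_MODE_SHIFT);
}

/* Mirror of the patch's three checks: Apple CPU, running at EL2, and the
 * sysreg-based workaround not already configured. */
static bool needs_wfi_nop(const struct cpu_state *s)
{
	assert(s != NULL);
	if (!s->apple_implementer)
		return false;
	if (current_el(s->currentel) != 2)
		return false;
	return wfi_mode(s->cyc_ovrd) != WFI_MODE_SYSREG_WORKAROUND;
}
```

Written this way it is easy to see that nothing in the inputs distinguishes a
nested guest running at EL2 from bare metal, which is the gap described above.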
Best Regards,
Nick Chan
> + return FIELD_GET(CYC_OVRD_WFI_MODE_MASK, read_sysreg_s(SYS_IMP_APL_CYC_OVRD)) != 2;
> +}
> +#endif
> +
> #ifdef CONFIG_ARM64_WORKAROUND_REPEAT_TLBI
> static const struct arm64_cpu_capabilities arm64_repeat_tlbi_list[] = {
> #ifdef CONFIG_QCOM_FALKOR_ERRATUM_1009
> @@ -1009,6 +1022,14 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
> .matches = has_impdef_pmuv3,
> .cpu_enable = cpu_enable_impdef_pmuv3_traps,
> },
> +#ifdef CONFIG_APPLE_ERRATUM_WFI_STATE
> + {
> + .desc = "Apple WFI loses state",
> + .capability = ARM64_WORKAROUND_WFI_STATE,
> + .type = ARM64_CPUCAP_SCOPE_BOOT_CPU | ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU,
> + .matches = has_apple_erratum_wfi_state,
> + },
> +#endif
> {
> }
> };
> diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
> index 9b85a84f6fd4..bbf8c15d79b0 100644
> --- a/arch/arm64/tools/cpucaps
> +++ b/arch/arm64/tools/cpucaps
> @@ -128,3 +128,4 @@ WORKAROUND_REPEAT_TLBI
> WORKAROUND_SPECULATIVE_AT
> WORKAROUND_SPECULATIVE_SSBS
> WORKAROUND_SPECULATIVE_UNPRIV_LOAD
> +WORKAROUND_WFI_STATE
>
> ---
> base-commit: c425609d6ac4012c8bbf01ec2e10e801b1923a7b
> change-id: 20260614-wfi-erratum-7a9f305f601f
>
> Best regards,
> --
> Yureka Lilian <yureka@cyberchaos.dev>
>
>
* Re: [PATCH v2] arm64: errata: Handle Apple WFI State Loss
2026-06-15 12:21 [PATCH v2] arm64: errata: Handle Apple WFI State Loss Yureka Lilian
2026-06-15 12:59 ` Nick Chan
@ 2026-06-15 15:02 ` Will Deacon
2026-06-15 15:27 ` Sven Peter
2026-06-17 19:23 ` Yureka Lilian
1 sibling, 2 replies; 8+ messages in thread
From: Will Deacon @ 2026-06-15 15:02 UTC (permalink / raw)
To: Yureka Lilian
Cc: Catalin Marinas, linux-arm-kernel, linux-kernel, asahi,
Sasha Finkelstein
On Mon, Jun 15, 2026 at 02:21:36PM +0200, Yureka Lilian wrote:
> Apple Silicon CPUs can lose register state in WFI, leading to crashes
> in the idle loop early in the boot process.
> This applies to any previous Apple Silicon CPUs too, but is worked
> around by configuring the WFI mode in SYS_IMP_APL_CYC_OVRD sysreg
> during m1n1's chickens setup.
> This workaround no longer exists since M4.
>
> Add a workaround capability for replacing wfi and wfit with nop, and
> an erratum to enable it on the affected CPUs if the workaround using the
> sysreg is not already applied. Leave the decision whether the sysreg
> workaround can be used up to the earlier parts of the boot chain which
> already configure the Apple Silicon chicken bits.
>
> This alternative has to be applied in early boot, since otherwise some
> cores might enter the idle loop before apply_alternatives_all() is run.
>
> Reviewed-by: Sasha Finkelstein <k@chaosmail.tech>
> Signed-off-by: Yureka Lilian <yureka@cyberchaos.dev>
> ---
> Changes since v1:
> Restricted the erratum to EL2 only, since in EL1 we'd expect the
> hypervisor to trap WFI and handle the erratum.
>
> Tested on M4 and M4 Pro (which now sometimes nondeterministically
> crash later during boot).
> Successfully booted on M3 Max with the SYS_IMP_APL_CYC_OVRD
> workaround disabled in the bootloader, as well as A18 Pro (which,
> like M4 / M4 Pro, doesn't have SYS_IMP_APL_CYC_OVRD).
>
> There is probably a better place for the SYS_IMP_APL_CYC_OVRD
> defines, which I currently put in the middle of cpu_errata.c, but I
> wouldn't know where.
> ---
> arch/arm64/Kconfig | 12 ++++++++++++
> arch/arm64/include/asm/barrier.h | 19 ++++++++++++++++---
> arch/arm64/kernel/cpu_errata.c | 21 +++++++++++++++++++++
> arch/arm64/tools/cpucaps | 1 +
> 4 files changed, 50 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index b3afe0688919..8c8ff069856f 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -453,6 +453,18 @@ config AMPERE_ERRATUM_AC04_CPU_23
>
> If unsure, say Y.
>
> +config APPLE_ERRATUM_WFI_STATE
> + bool "Apple Silicon: WFI loses state"
> + default y
> + help
> + This option adds an alternative code sequence to work around some
> + Apple Silicon CPUs losing register state during wfi and wfit
> + instructions.
> +
> + As a workaround, the wfi and wfit instructions are replaced with nop
> + operations via the alternative framework if an affected CPU is
> + detected.
> +
> config ARM64_WORKAROUND_CLEAN_CACHE
> bool
>
> diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
> index 9495c4441a46..f72eddc7c434 100644
> --- a/arch/arm64/include/asm/barrier.h
> +++ b/arch/arm64/include/asm/barrier.h
> @@ -20,9 +20,22 @@
> #define wfe() asm volatile("wfe" : : : "memory")
> #define wfet(val) asm volatile("msr s0_3_c1_c0_0, %0" \
> : : "r" (val) : "memory")
> -#define wfi() asm volatile("wfi" : : : "memory")
> -#define wfit(val) asm volatile("msr s0_3_c1_c0_1, %0" \
> - : : "r" (val) : "memory")
> +#define wfi() \
> + do { \
> + asm volatile( \
> + ALTERNATIVE("wfi", \
> + "nop", \
> + ARM64_WORKAROUND_WFI_STATE) \
> + : : : "memory"); \
> + } while (0)
> +#define wfit(val) \
> + do { \
> + asm volatile( \
> + ALTERNATIVE("msr s0_3_c1_c0_1, %0", \
> + "nop", \
> + ARM64_WORKAROUND_WFI_STATE) \
> + : : "r" (val) : "memory"); \
> + } while (0)
How can you guarantee that we don't run one of these prior to patching?
I wonder if we're better off doing something like x86 and having an "idle="
cmdline option, which could choose between e.g. nop, wfe, wfi and yield?
In fact, would wfe be a better choice than nop for you?
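
To make the suggestion concrete, here is a minimal user-space sketch of how
the value of such an option could be mapped to an idle instruction. Everything
in it is an assumption for illustration: no "idle=" parameter exists on arm64
in this patch, and enum idle_insn, idle_choices and parse_idle() are
hypothetical names. In the kernel this would presumably hang off an
early_param() handler rather than a free-standing function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical set of idle instructions, taken from the four values named above. */
enum idle_insn {
	IDLE_WFI,   /* default behaviour */
	IDLE_WFE,
	IDLE_NOP,
	IDLE_YIELD,
};

struct idle_choice {
	const char     *name;
	enum idle_insn  insn;
};

/* Table of accepted option values (illustrative only). */
static const struct idle_choice idle_choices[] = {
	{ "wfi",   IDLE_WFI   },
	{ "wfe",   IDLE_WFE   },
	{ "nop",   IDLE_NOP   },
	{ "yield", IDLE_YIELD },
};

/*
 * Parse the value part of a hypothetical "idle=" option.
 * Returns true and stores the choice in *out on a match; returns false
 * and leaves *out untouched for NULL or unknown input.
 */
static bool parse_idle(const char *arg, enum idle_insn *out)
{
	size_t i;

	assert(out != NULL);
	if (arg == NULL)
		return false;

	for (i = 0; i < sizeof(idle_choices) / sizeof(idle_choices[0]); i++) {
		if (strcmp(arg, idle_choices[i].name) == 0) {
			*out = idle_choices[i].insn;
			return true;
		}
	}
	return false;
}
```

Leaving *out untouched on failure means a caller can initialise it to IDLE_WFI
and keep today's behaviour when the option is absent or malformed.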
Will
* Re: [PATCH v2] arm64: errata: Handle Apple WFI State Loss
2026-06-15 15:02 ` Will Deacon
@ 2026-06-15 15:27 ` Sven Peter
2026-06-17 19:23 ` Yureka Lilian
1 sibling, 0 replies; 8+ messages in thread
From: Sven Peter @ 2026-06-15 15:27 UTC (permalink / raw)
To: Will Deacon, Yureka Lilian
Cc: Catalin Marinas, linux-arm-kernel, linux-kernel, asahi,
Sasha Finkelstein
On 15.06.26 17:02, Will Deacon wrote:
> On Mon, Jun 15, 2026 at 02:21:36PM +0200, Yureka Lilian wrote:
>> Apple Silicon CPUs can lose register state in WFI, leading to crashes
>> in the idle loop early in the boot process.
>> This applies to any previous Apple Silicon CPUs too, but is worked
>> around by configuring the WFI mode in SYS_IMP_APL_CYC_OVRD sysreg
>> during m1n1's chickens setup.
>> This workaround no longer exists since M4.
>>
>> Add a workaround capability for replacing wfi and wfit with nop, and
>> an erratum to enable it on the affected CPUs if the workaround using the
>> sysreg is not already applied. Leave the decision whether the sysreg
>> workaround can be used up to the earlier parts of the boot chain which
>> already configure the Apple Silicon chicken bits.
>>
>> This alternative has to be applied in early boot, since otherwise some
>> cores might enter the idle loop before apply_alternatives_all() is run.
>>
>> Reviewed-by: Sasha Finkelstein <k@chaosmail.tech>
>> Signed-off-by: Yureka Lilian <yureka@cyberchaos.dev>
>> ---
>> Changes since v1:
>> Restricted the erratum to EL2 only, since in EL1 we'd expect the
>> hypervisor to trap WFI and handle the erratum.
>>
>> Tested on M4 and M4 Pro (which now sometimes nondeterministically
>> crash later during boot).
>> Successfully booted on M3 Max with the SYS_IMP_APL_CYC_OVRD
>> workaround disabled in the bootloader, as well as A18 Pro (which,
>> like M4 / M4 Pro, doesn't have SYS_IMP_APL_CYC_OVRD).
>>
>> There is probably a better place for the SYS_IMP_APL_CYC_OVRD
>> defines, which I currently put in the middle of cpu_errata.c, but I
>> wouldn't know where.
>> ---
>> arch/arm64/Kconfig | 12 ++++++++++++
>> arch/arm64/include/asm/barrier.h | 19 ++++++++++++++++---
>> arch/arm64/kernel/cpu_errata.c | 21 +++++++++++++++++++++
>> arch/arm64/tools/cpucaps | 1 +
>> 4 files changed, 50 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index b3afe0688919..8c8ff069856f 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -453,6 +453,18 @@ config AMPERE_ERRATUM_AC04_CPU_23
>>
>> If unsure, say Y.
>>
>> +config APPLE_ERRATUM_WFI_STATE
>> + bool "Apple Silicon: WFI loses state"
>> + default y
>> + help
>> + This option adds an alternative code sequence to work around some
>> + Apple Silicon CPUs losing register state during wfi and wfit
>> + instructions.
>> +
>> + As a workaround, the wfi and wfit instructions are replaced with nop
>> + operations via the alternative framework if an affected CPU is
>> + detected.
>> +
>> config ARM64_WORKAROUND_CLEAN_CACHE
>> bool
>>
>> diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
>> index 9495c4441a46..f72eddc7c434 100644
>> --- a/arch/arm64/include/asm/barrier.h
>> +++ b/arch/arm64/include/asm/barrier.h
>> @@ -20,9 +20,22 @@
>> #define wfe() asm volatile("wfe" : : : "memory")
>> #define wfet(val) asm volatile("msr s0_3_c1_c0_0, %0" \
>> : : "r" (val) : "memory")
>> -#define wfi() asm volatile("wfi" : : : "memory")
>> -#define wfit(val) asm volatile("msr s0_3_c1_c0_1, %0" \
>> - : : "r" (val) : "memory")
>> +#define wfi() \
>> + do { \
>> + asm volatile( \
>> + ALTERNATIVE("wfi", \
>> + "nop", \
>> + ARM64_WORKAROUND_WFI_STATE) \
>> + : : : "memory"); \
>> + } while (0)
>> +#define wfit(val) \
>> + do { \
>> + asm volatile( \
>> + ALTERNATIVE("msr s0_3_c1_c0_1, %0", \
>> + "nop", \
>> + ARM64_WORKAROUND_WFI_STATE) \
>> + : : "r" (val) : "memory"); \
>> + } while (0)
>
> How can you guarantee that we don't run one of these prior to patching?
>
> I wonder if we're better off doing something like x86 and having an "idle="
> cmdline option, which could choose between e.g. nop, wfe, wfi and yield?
> In fact, would wfe be a better choice than nop for you?
I think that should also solve the issue of detecting if we're running
on bare metal and need this or if there's something that mitigates the
issue (e.g. M1-M3 where we can still enable that chicken bit or a
hypervisor that just traps wfi): Just let the bootloader decide and
inject that cmdline option.
Best,
Sven
* Re: [PATCH v2] arm64: errata: Handle Apple WFI State Loss
2026-06-15 15:02 ` Will Deacon
2026-06-15 15:27 ` Sven Peter
@ 2026-06-17 19:23 ` Yureka Lilian
2026-06-19 9:24 ` Will Deacon
2026-06-19 10:38 ` Mark Rutland
1 sibling, 2 replies; 8+ messages in thread
From: Yureka Lilian @ 2026-06-17 19:23 UTC (permalink / raw)
To: Will Deacon, Yureka Lilian
Cc: Catalin Marinas, linux-arm-kernel, linux-kernel, asahi,
Sasha Finkelstein
On 6/15/26 17:02, Will Deacon wrote:
> On Mon, Jun 15, 2026 at 02:21:36PM +0200, Yureka Lilian wrote:
>> Apple Silicon CPUs can lose register state in WFI, leading to crashes
>> in the idle loop early in the boot process.
>> This applies to any previous Apple Silicon CPUs too, but is worked
>> around by configuring the WFI mode in SYS_IMP_APL_CYC_OVRD sysreg
>> during m1n1's chickens setup.
>> This workaround no longer exists since M4.
>>
>> Add a workaround capability for replacing wfi and wfit with nop, and
>> an erratum to enable it on the affected CPUs if the workaround using the
>> sysreg is not already applied. Leave the decision whether the sysreg
>> workaround can be used up to the earlier parts of the boot chain which
>> already configure the Apple Silicon chicken bits.
>>
>> This alternative has to be applied in early boot, since otherwise some
>> cores might enter the idle loop before apply_alternatives_all() is run.
>>
>> Reviewed-by: Sasha Finkelstein <k@chaosmail.tech>
>> Signed-off-by: Yureka Lilian <yureka@cyberchaos.dev>
>> ---
>> Changes since v1:
>> Restricted the erratum to EL2 only, since in EL1 we'd expect the
>> hypervisor to trap WFI and handle the erratum.
>>
>> Tested on M4 and M4 Pro (which now sometimes nondeterministically
>> crash later during boot).
>> Successfully booted on M3 Max with the SYS_IMP_APL_CYC_OVRD
>> workaround disabled in the bootloader, as well as A18 Pro (which,
>> like M4 / M4 Pro, doesn't have SYS_IMP_APL_CYC_OVRD).
>>
>> There is probably a better place for the SYS_IMP_APL_CYC_OVRD
>> defines, which I currently put in the middle of cpu_errata.c, but I
>> wouldn't know where.
>> ---
>> arch/arm64/Kconfig | 12 ++++++++++++
>> arch/arm64/include/asm/barrier.h | 19 ++++++++++++++++---
>> arch/arm64/kernel/cpu_errata.c | 21 +++++++++++++++++++++
>> arch/arm64/tools/cpucaps | 1 +
>> 4 files changed, 50 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index b3afe0688919..8c8ff069856f 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -453,6 +453,18 @@ config AMPERE_ERRATUM_AC04_CPU_23
>>
>> If unsure, say Y.
>>
>> +config APPLE_ERRATUM_WFI_STATE
>> + bool "Apple Silicon: WFI loses state"
>> + default y
>> + help
>> + This option adds an alternative code sequence to work around some
>> + Apple Silicon CPUs losing register state during wfi and wfit
>> + instructions.
>> +
>> + As a workaround, the wfi and wfit instructions are replaced with nop
>> + operations via the alternative framework if an affected CPU is
>> + detected.
>> +
>> config ARM64_WORKAROUND_CLEAN_CACHE
>> bool
>>
>> diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
>> index 9495c4441a46..f72eddc7c434 100644
>> --- a/arch/arm64/include/asm/barrier.h
>> +++ b/arch/arm64/include/asm/barrier.h
>> @@ -20,9 +20,22 @@
>> #define wfe() asm volatile("wfe" : : : "memory")
>> #define wfet(val) asm volatile("msr s0_3_c1_c0_0, %0" \
>> : : "r" (val) : "memory")
>> -#define wfi() asm volatile("wfi" : : : "memory")
>> -#define wfit(val) asm volatile("msr s0_3_c1_c0_1, %0" \
>> - : : "r" (val) : "memory")
>> +#define wfi() \
>> + do { \
>> + asm volatile( \
>> + ALTERNATIVE("wfi", \
>> + "nop", \
>> + ARM64_WORKAROUND_WFI_STATE) \
>> + : : : "memory"); \
>> + } while (0)
>> +#define wfit(val) \
>> + do { \
>> + asm volatile( \
>> + ALTERNATIVE("msr s0_3_c1_c0_1, %0", \
>> + "nop", \
>> + ARM64_WORKAROUND_WFI_STATE) \
>> + : : "r" (val) : "memory"); \
>> + } while (0)
> How can you guarantee that we don't run one of these prior to patching?
We can't, but there are a few points to our advantage, namely the boot
CPU isn't actually affected by this (when the CYC_OVRD bits are not
configured or not supported), and the first round of patching happens
quite early, before the other CPUs are started.
>
> I wonder if we're better off doing something like x86 and having an "idle="
> cmdline option, which could choose between e.g. nop, wfe, wfi and yield?
This is a good idea.
We also considered using nohlt; however, this doesn't disable the use of
wfit in __delay (which I confirmed also loses register state with
sufficiently long timeout values), and it would prevent us from enabling
a PSCI-based idle later. So it seems a new parameter is needed.
> In fact, would wfe be a better choice than nop for you?
Regarding wfe: simply replacing the wfi with wfe on these particular
machines leads to them getting stuck in the boot process (entering wfe
on the boot core and never waking up again), maybe because some kinds of
interrupts do not count as events for the wfe wake-up?
>
> Will
Best regards,
--
Yureka Lilian <yureka@cyberchaos.dev>
* Re: [PATCH v2] arm64: errata: Handle Apple WFI State Loss
2026-06-17 19:23 ` Yureka Lilian
@ 2026-06-19 9:24 ` Will Deacon
2026-06-19 10:38 ` Mark Rutland
1 sibling, 0 replies; 8+ messages in thread
From: Will Deacon @ 2026-06-19 9:24 UTC (permalink / raw)
To: Yureka Lilian
Cc: Catalin Marinas, linux-arm-kernel, linux-kernel, asahi,
Sasha Finkelstein
On Wed, Jun 17, 2026 at 09:23:03PM +0200, Yureka Lilian wrote:
> On 6/15/26 17:02, Will Deacon wrote:
> > In fact, would wfe be a better choice than nop for you?
>
> Regarding wfe: simply replacing the wfi with wfe on these particular
> machines leads to them getting stuck in the boot process (entering wfe on
> the boot core and never waking up again), maybe because some kinds of
> interrupts do not count as events for the wfe wake-up?
Argh, I had forgotten that a pending masked interrupt doesn't wake up
a WFE (it does wake up a WFI).
So we probably shouldn't have wfe in the list of idle instruction choices
(unless we want to rely on the event stream).
Will
* Re: [PATCH v2] arm64: errata: Handle Apple WFI State Loss
2026-06-17 19:23 ` Yureka Lilian
2026-06-19 9:24 ` Will Deacon
@ 2026-06-19 10:38 ` Mark Rutland
2026-06-19 12:40 ` Sven Peter
1 sibling, 1 reply; 8+ messages in thread
From: Mark Rutland @ 2026-06-19 10:38 UTC (permalink / raw)
To: Yureka Lilian
Cc: Will Deacon, Catalin Marinas, linux-arm-kernel, linux-kernel,
asahi, Sasha Finkelstein
On Wed, Jun 17, 2026 at 09:23:03PM +0200, Yureka Lilian wrote:
> On 6/15/26 17:02, Will Deacon wrote:
> > On Mon, Jun 15, 2026 at 02:21:36PM +0200, Yureka Lilian wrote:
> > > Apple Silicon CPUs can lose register state in WFI, leading to crashes
> > > in the idle loop early in the boot process.
> > > This applies to any previous Apple Silicon CPUs too, but is worked
> > > around by configuring the WFI mode in SYS_IMP_APL_CYC_OVRD sysreg
> > > during m1n1's chickens setup.
> > > This workaround no longer exists since M4.
Are we *certain* that there's no equivalent control elsewhere? i.e. this
hasn't just moved?
> > > Add a workaround capability for replacing wfi and wfit with nop, and
> > > an erratum to enable it on the affected CPUs if the workaround using the
> > > sysreg is not already applied. Leave the decision whether the sysreg
> > > workaround can be used up to the earlier parts of the boot chain which
> > > already configure the Apple Silicon chicken bits.
> > >
> > > This alternative has to be applied in early boot, since otherwise some
> > > cores might enter the idle loop before apply_alternatives_all() is run.
> > >
> > > Reviewed-by: Sasha Finkelstein <k@chaosmail.tech>
> > > Signed-off-by: Yureka Lilian <yureka@cyberchaos.dev>
> > > ---
> > > Changes since v1:
> > > Restricted the erratum to EL2 only, since in EL1 we'd expect the
> > > hypervisor to trap WFI and handle the erratum.
The KVM portion doesn't seem to be implemented in this patch, so we
can't rely on that as-is.
[...]
> > > #define wfe() asm volatile("wfe" : : : "memory")
> > > #define wfet(val) asm volatile("msr s0_3_c1_c0_0, %0" \
> > > : : "r" (val) : "memory")
> > > -#define wfi() asm volatile("wfi" : : : "memory")
> > > -#define wfit(val) asm volatile("msr s0_3_c1_c0_1, %0" \
> > > - : : "r" (val) : "memory")
> > > +#define wfi() \
> > > + do { \
> > > + asm volatile( \
> > > + ALTERNATIVE("wfi", \
> > > + "nop", \
> > > + ARM64_WORKAROUND_WFI_STATE) \
> > > + : : : "memory"); \
> > > + } while (0)
> > > +#define wfit(val) \
> > > + do { \
> > > + asm volatile( \
> > > + ALTERNATIVE("msr s0_3_c1_c0_1, %0", \
> > > + "nop", \
> > > + ARM64_WORKAROUND_WFI_STATE) \
> > > + : : "r" (val) : "memory"); \
> > > + } while (0)
> > How can you guarantee that we don't run one of these prior to patching?
>
> We can't, but there are a few points to our advantage, namely the boot cpu
> isn't actually affected by this (when the CYC_OVRD bits are not configured
> or not supported), and first round of patching happens quite early before
> the other cpus are started.
I think you're saying that:
* On the boot CPU, WFI *never* loses register state.
* On other CPUs, WFI *might* lose register state (and this cannot be
inhibited).
Is that understanding correct, or are there other conditions where a WFI
on the boot CPU can lose register state?
IIRC kdump doesn't ensure the new kernel is started on the boot CPU, so
I think that would be broken. I guess you can't kexec generally due to a
lack of offlining of secondary CPUs.
Mark.
* Re: [PATCH v2] arm64: errata: Handle Apple WFI State Loss
2026-06-19 10:38 ` Mark Rutland
@ 2026-06-19 12:40 ` Sven Peter
0 siblings, 0 replies; 8+ messages in thread
From: Sven Peter @ 2026-06-19 12:40 UTC (permalink / raw)
To: Mark Rutland, Yureka Lilian
Cc: Will Deacon, Catalin Marinas, linux-arm-kernel, linux-kernel,
asahi, Sasha Finkelstein
On 6/19/26 12:38, Mark Rutland wrote:
> On Wed, Jun 17, 2026 at 09:23:03PM +0200, Yureka Lilian wrote:
>> On 6/15/26 17:02, Will Deacon wrote:
>>> On Mon, Jun 15, 2026 at 02:21:36PM +0200, Yureka Lilian wrote:
>>>> Apple Silicon CPUs can lose register state in WFI, leading to crashes
>>>> in the idle loop early in the boot process.
>>>> This applies to any previous Apple Silicon CPUs too, but is worked
>>>> around by configuring the WFI mode in SYS_IMP_APL_CYC_OVRD sysreg
>>>> during m1n1's chickens setup.
>>>> This workaround no longer exists since M4.
>
> Are we *certain* that there's no equivalent control elsewhere? i.e. this
> hasn't just moved?
We are as certain as we can be short of Apple confirming this, which
isn't going to happen.
XNU has a helper function to "force wfi to use clock gating only" [1]
which is how we learned about this control originally on M1.
This has been disabled starting with M4 using the "NO_CPU_OVRD" define
which they describe as "CPU_OVRD register accesses are banned" [2]. If
there was an equivalent control elsewhere I would expect them to just
use that one instead.
In addition, most non-architectural sysregs are read-only starting with
M4 in the non-Apple-entitled boot mode, so even if there was such a
control we would likely not be able to access it.
[1]
https://github.com/apple-oss-distributions/xnu/blob/f6217f891ac0bb64f3d375211650a4c1ff8ca1ea/osfmk/arm64/machine_routines_asm.s#L1129
[2]
https://github.com/apple-oss-distributions/xnu/blob/f6217f891ac0bb64f3d375211650a4c1ff8ca1ea/pexpert/pexpert/arm64/board_config.h#L197
>
>>>> Add a workaround capability for replacing wfi and wfit with nop, and
>>>> an erratum to enable it on the affected CPUs if the workaround using the
>>>> sysreg is not already applied. Leave the decision whether the sysreg
[...]
>>>> + } while (0)
>>> How can you guarantee that we don't run one of these prior to patching?
>>
>> We can't, but there are a few points to our advantage, namely the boot cpu
>> isn't actually affected by this (when the CYC_OVRD bits are not configured
>> or not supported), and first round of patching happens quite early before
>> the other cpus are started.
>
> I think you're saying that:
>
> * On the boot CPU, WFI *never* loses register state.
>
> * On other CPUs, WFI *might* lose register state (and this cannot be
> inhibited).
>
> Is that understanding correct, or are there other conditions where a WFI
> on the boot CPU can lose register state?
Those are our current observations, yes. We don't know why the boot CPU
behaves differently, and there are no differences in any Apple sysregs that
would explain it.
But looking at all wfis in the kernel, there are a bunch in head.S and
similar places that serve as infinite loops, where we don't care if register
state is lost. The only two that currently matter are the wfit in __delay
and the wfi in the idle loop.
The __delay one gets enabled after arm64_features are found, which
happens just before arm64_errata from setup_boot_cpu_features(), and
there's no __delay call in between that and when alternatives are
applied. If we follow Will's suggestion with an early_param, that happens
much earlier as well.
My understanding is that the idle loop won't be reached before
sched_init() and that also happens much later.
>
> IIRC kdump doesn't ensure the new kernel is started on the boot CPU, so
> I think that would be broken. I guess you can't kexec generally due to a
> lack of offlining of secondary CPUs.
Besides that, kexec also runs into issues with all the various
co-processors, which we can't easily reset or shut down once they've been
brought up.
Sven
Thread overview: 8+ messages
2026-06-15 12:21 [PATCH v2] arm64: errata: Handle Apple WFI State Loss Yureka Lilian
2026-06-15 12:59 ` Nick Chan
2026-06-15 15:02 ` Will Deacon
2026-06-15 15:27 ` Sven Peter
2026-06-17 19:23 ` Yureka Lilian
2026-06-19 9:24 ` Will Deacon
2026-06-19 10:38 ` Mark Rutland
2026-06-19 12:40 ` Sven Peter