public inbox for linux-arm-kernel@lists.infradead.org
 help / color / mirror / Atom feed
* [PATCH 0/3] arm64: WFxT fixes, take #2
@ 2026-02-26  8:22 Marc Zyngier
  2026-02-26  8:22 ` [PATCH 1/3] arm64: Fix sampling the "stable" virtual counter in preemptible section Marc Zyngier
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Marc Zyngier @ 2026-02-26  8:22 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Catalin Marinas, Mark Rutland, Ben Horgan,
	Daniel Lezcano

After my previous WFxT fix went in 7.0-rc1 as 29cc0f3aa7c64 ("arm64:
Force the use of CNTVCT_EL0 in __delay()"), Ben reported that it isn't
playing nice with preemption, due to the use of the timer workaround
percpu variable. This series tries to address this, and propose a
hopefully better alternative.

That alternative is in the form of a new "virtual counter" accessor,
available in the same way arch_timer_read_counter() is, except that it
is guaranteed to be the virtual counter. This helper (a function
pointer, really) is updated on each CPU boot in the same manner the
counter accessor is.

This is then plugged into the __delay() helper, providing the expected
guarantees (and resulting in a much nicer code gen).

Patches on top of -rc1.

Marc Zyngier (3):
  arm64: Fix sampling the "stable" virtual counter in preemptible
    section
  clocksource/drivers/arm_arch_timer: Expose a direct accessor for the
    virtual counter
  arm64: Convert __delay_cycles() to arch_timer_read_vcounter()

 arch/arm64/lib/delay.c               | 5 ++++-
 drivers/clocksource/arm_arch_timer.c | 5 +++++
 include/clocksource/arm_arch_timer.h | 1 +
 3 files changed, 10 insertions(+), 1 deletion(-)

-- 
2.47.3



^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/3] arm64: Fix sampling the "stable" virtual counter in preemptible section
  2026-02-26  8:22 [PATCH 0/3] arm64: WFxT fixes, take #2 Marc Zyngier
@ 2026-02-26  8:22 ` Marc Zyngier
  2026-02-26  8:22 ` [PATCH 2/3] clocksource/drivers/arm_arch_timer: Expose a direct accessor for the virtual counter Marc Zyngier
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Marc Zyngier @ 2026-02-26  8:22 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Catalin Marinas, Mark Rutland, Ben Horgan,
	Daniel Lezcano

Ben reports that when running with CONFIG_DEBUG_PREEMPT, using
__arch_counter_get_cntvct_stable() results in well deserves warnings,
as we access a per-CPU variable without preemption disabled.

Fix the issue by disabling preemption on reading the counter. We can
probably do a lot better by not disabling preemption on systems that
do not require horrible workarounds to return a valid counter value,
but this plugs the issue for the time being.

Fixes: 29cc0f3aa7c64 ("arm64: Force the use of CNTVCT_EL0 in __delay()")
Reported-by: Ben Horgan <ben.horgan@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/aZw3EGs4rbQvbAzV@e134344.arm.com
---
 arch/arm64/lib/delay.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/lib/delay.c b/arch/arm64/lib/delay.c
index d02341303899e..e278e060e78a9 100644
--- a/arch/arm64/lib/delay.c
+++ b/arch/arm64/lib/delay.c
@@ -32,7 +32,11 @@ static inline unsigned long xloops_to_cycles(unsigned long xloops)
  * Note that userspace cannot change the offset behind our back either,
  * as the vcpu mutex is held as long as KVM_RUN is in progress.
  */
-#define __delay_cycles()	__arch_counter_get_cntvct_stable()
+static cycles_t notrace __delay_cycles(void)
+{
+	guard(preempt_notrace)();
+	return __arch_counter_get_cntvct_stable();
+}
 
 void __delay(unsigned long cycles)
 {
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/3] clocksource/drivers/arm_arch_timer: Expose a direct accessor for the virtual counter
  2026-02-26  8:22 [PATCH 0/3] arm64: WFxT fixes, take #2 Marc Zyngier
  2026-02-26  8:22 ` [PATCH 1/3] arm64: Fix sampling the "stable" virtual counter in preemptible section Marc Zyngier
@ 2026-02-26  8:22 ` Marc Zyngier
  2026-02-26 13:48   ` Ben Horgan
  2026-02-26  8:22 ` [PATCH 3/3] arm64: Convert __delay_cycles() to arch_timer_read_vcounter() Marc Zyngier
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 11+ messages in thread
From: Marc Zyngier @ 2026-02-26  8:22 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Catalin Marinas, Mark Rutland, Ben Horgan,
	Daniel Lezcano

We allow access to the architected counter via arch_timer_read_counter().
However, this accessor can either be the virtual or the physical
view of the counter, depending on how the kernel has been booted.

At the same time, we have some architectural features (such as WFIT,
WFET) that rely on the virtual counter, and nothing else.

If implementations were perfect, we'd rely on reading CNTVCT_EL0,
and be done with it. However, we have a bunch of broken implementations
in the wild, which rely on preemption being disabled and other
costly workarounds.

In order to provide decent performance on non-broken HW while still
supporting the legacy horrors, expose arch_timer_read_vcounter() as
a new helper that hides this complexity.

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 drivers/clocksource/arm_arch_timer.c | 5 +++++
 include/clocksource/arm_arch_timer.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 90aeff44a2764..4e4a62e1c9439 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -137,6 +137,8 @@ static noinstr u64 arch_counter_get_cntvct(void)
 u64 (*arch_timer_read_counter)(void) __ro_after_init = arch_counter_get_cntvct;
 EXPORT_SYMBOL_GPL(arch_timer_read_counter);
 
+u64 (*arch_timer_read_vcounter)(void) __ro_after_init = arch_counter_get_cntvct;
+
 static u64 arch_counter_read(struct clocksource *cs)
 {
 	return arch_timer_read_counter();
@@ -931,6 +933,9 @@ static void __init arch_counter_register(void)
 	}
 
 	arch_timer_read_counter = rd;
+	arch_timer_read_vcounter = (arch_timer_counter_has_wa() ?
+				    arch_counter_get_cntvct_stable :
+				    arch_counter_get_cntvct);
 	clocksource_counter.vdso_clock_mode = vdso_default;
 
 	width = arch_counter_get_width();
diff --git a/include/clocksource/arm_arch_timer.h b/include/clocksource/arm_arch_timer.h
index 2eda895f19f54..5bfd6a5db75aa 100644
--- a/include/clocksource/arm_arch_timer.h
+++ b/include/clocksource/arm_arch_timer.h
@@ -88,6 +88,7 @@ struct arch_timer_mem {
 
 extern u32 arch_timer_get_rate(void);
 extern u64 (*arch_timer_read_counter)(void);
+extern u64 (*arch_timer_read_vcounter)(void);
 extern struct arch_timer_kvm_info *arch_timer_get_kvm_info(void);
 extern bool arch_timer_evtstrm_available(void);
 
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/3] arm64: Convert __delay_cycles() to arch_timer_read_vcounter()
  2026-02-26  8:22 [PATCH 0/3] arm64: WFxT fixes, take #2 Marc Zyngier
  2026-02-26  8:22 ` [PATCH 1/3] arm64: Fix sampling the "stable" virtual counter in preemptible section Marc Zyngier
  2026-02-26  8:22 ` [PATCH 2/3] clocksource/drivers/arm_arch_timer: Expose a direct accessor for the virtual counter Marc Zyngier
@ 2026-02-26  8:22 ` Marc Zyngier
  2026-02-26 12:53 ` [PATCH 0/3] arm64: WFxT fixes, take #2 André Draszik
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Marc Zyngier @ 2026-02-26  8:22 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Catalin Marinas, Mark Rutland, Ben Horgan,
	Daniel Lezcano

Relax the need for disabling preemption in __delay_cycles() by
using arch_timer_read_vcounter(), which will disable preemption
only when this is actually required.

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/arm64/lib/delay.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/lib/delay.c b/arch/arm64/lib/delay.c
index e278e060e78a9..a667df920697d 100644
--- a/arch/arm64/lib/delay.c
+++ b/arch/arm64/lib/delay.c
@@ -32,10 +32,9 @@ static inline unsigned long xloops_to_cycles(unsigned long xloops)
  * Note that userspace cannot change the offset behind our back either,
  * as the vcpu mutex is held as long as KVM_RUN is in progress.
  */
-static cycles_t notrace __delay_cycles(void)
+static cycles_t __delay_cycles(void)
 {
-	guard(preempt_notrace)();
-	return __arch_counter_get_cntvct_stable();
+	return arch_timer_read_vcounter();
 }
 
 void __delay(unsigned long cycles)
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] arm64: WFxT fixes, take #2
  2026-02-26  8:22 [PATCH 0/3] arm64: WFxT fixes, take #2 Marc Zyngier
                   ` (2 preceding siblings ...)
  2026-02-26  8:22 ` [PATCH 3/3] arm64: Convert __delay_cycles() to arch_timer_read_vcounter() Marc Zyngier
@ 2026-02-26 12:53 ` André Draszik
  2026-02-26 13:36 ` Ben Horgan
  2026-02-27  3:16 ` Will Deacon
  5 siblings, 0 replies; 11+ messages in thread
From: André Draszik @ 2026-02-26 12:53 UTC (permalink / raw)
  To: Marc Zyngier, linux-arm-kernel
  Cc: Will Deacon, Catalin Marinas, Mark Rutland, Ben Horgan,
	Daniel Lezcano

Hi,

On Thu, 2026-02-26 at 08:22 +0000, Marc Zyngier wrote:
> After my previous WFxT fix went in 7.0-rc1 as 29cc0f3aa7c64 ("arm64:
> Force the use of CNTVCT_EL0 in __delay()"), Ben reported that it isn't
> playing nice with preemption, due to the use of the timer workaround
> percpu variable. This series tries to address this, and propose a
> hopefully better alternative.
> 
> That alternative is in the form of a new "virtual counter" accessor,
> available in the same way arch_timer_read_counter() is, except that it
> is guaranteed to be the virtual counter. This helper (a function
> pointer, really) is updated on each CPU boot in the same manner the
> counter accessor is.
> 
> This is then plugged into the __delay() helper, providing the expected
> guarantees (and resulting in a much nicer code gen).
> 
> Patches on top of -rc1.
> 
> Marc Zyngier (3):
>   arm64: Fix sampling the "stable" virtual counter in preemptible
>     section
>   clocksource/drivers/arm_arch_timer: Expose a direct accessor for the
>     virtual counter
>   arm64: Convert __delay_cycles() to arch_timer_read_vcounter()
> 
>  arch/arm64/lib/delay.c               | 5 ++++-
>  drivers/clocksource/arm_arch_timer.c | 5 +++++
>  include/clocksource/arm_arch_timer.h | 1 +
>  3 files changed, 10 insertions(+), 1 deletion(-)

FWIW, I was observing similar traces from various drivers and can confirm that
for me, on Pixel 6, the warnings are gone again with these patches applied on
top of next-20260225

Tested-by: André Draszik <andre.draszik@linaro.org>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] arm64: WFxT fixes, take #2
  2026-02-26  8:22 [PATCH 0/3] arm64: WFxT fixes, take #2 Marc Zyngier
                   ` (3 preceding siblings ...)
  2026-02-26 12:53 ` [PATCH 0/3] arm64: WFxT fixes, take #2 André Draszik
@ 2026-02-26 13:36 ` Ben Horgan
  2026-02-27  3:16 ` Will Deacon
  5 siblings, 0 replies; 11+ messages in thread
From: Ben Horgan @ 2026-02-26 13:36 UTC (permalink / raw)
  To: Marc Zyngier, linux-arm-kernel
  Cc: Will Deacon, Catalin Marinas, Mark Rutland, Daniel Lezcano

Hi Marc,

On 2/26/26 08:22, Marc Zyngier wrote:
> After my previous WFxT fix went in 7.0-rc1 as 29cc0f3aa7c64 ("arm64:
> Force the use of CNTVCT_EL0 in __delay()"), Ben reported that it isn't
> playing nice with preemption, due to the use of the timer workaround
> percpu variable. This series tries to address this, and propose a
> hopefully better alternative.
> 
> That alternative is in the form of a new "virtual counter" accessor,
> available in the same way arch_timer_read_counter() is, except that it
> is guaranteed to be the virtual counter. This helper (a function
> pointer, really) is updated on each CPU boot in the same manner the
> counter accessor is.
> 
> This is then plugged into the __delay() helper, providing the expected
> guarantees (and resulting in a much nicer code gen).
> 
> Patches on top of -rc1.
> 
> Marc Zyngier (3):
>   arm64: Fix sampling the "stable" virtual counter in preemptible
>     section
>   clocksource/drivers/arm_arch_timer: Expose a direct accessor for the
>     virtual counter
>   arm64: Convert __delay_cycles() to arch_timer_read_vcounter()
> 
>  arch/arm64/lib/delay.c               | 5 ++++-
>  drivers/clocksource/arm_arch_timer.c | 5 +++++
>  include/clocksource/arm_arch_timer.h | 1 +
>  3 files changed, 10 insertions(+), 1 deletion(-)
> 

I checked with just the first patch applied and with all the patchs
applied. In both case I no longer see the preemption warnings on the s/w
model FVP Base C with CONFIG_DEBUG_PREEMPT.

Tested-by: Ben Horgan <ben.horgan@arm.com>

Thanks,

Ben



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] clocksource/drivers/arm_arch_timer: Expose a direct accessor for the virtual counter
  2026-02-26  8:22 ` [PATCH 2/3] clocksource/drivers/arm_arch_timer: Expose a direct accessor for the virtual counter Marc Zyngier
@ 2026-02-26 13:48   ` Ben Horgan
  2026-02-26 14:03     ` Ben Horgan
  0 siblings, 1 reply; 11+ messages in thread
From: Ben Horgan @ 2026-02-26 13:48 UTC (permalink / raw)
  To: Marc Zyngier, linux-arm-kernel
  Cc: Will Deacon, Catalin Marinas, Mark Rutland, Daniel Lezcano

Hi Marc,

On 2/26/26 08:22, Marc Zyngier wrote:
> We allow access to the architected counter via arch_timer_read_counter().
> However, this accessor can either be the virtual or the physical
> view of the counter, depending on how the kernel has been booted.
> 
> At the same time, we have some architectural features (such as WFIT,
> WFET) that rely on the virtual counter, and nothing else.
> 
> If implementations were perfect, we'd rely on reading CNTVCT_EL0,
> and be done with it. However, we have a bunch of broken implementations
> in the wild, which rely on preemption being disabled and other
> costly workarounds.
> 
> In order to provide decent performance on non-broken HW while still
> supporting the legacy horrors, expose arch_timer_read_vcounter() as
> a new helper that hides this complexity.
> 
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
>  drivers/clocksource/arm_arch_timer.c | 5 +++++
>  include/clocksource/arm_arch_timer.h | 1 +
>  2 files changed, 6 insertions(+)
> 
> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
> index 90aeff44a2764..4e4a62e1c9439 100644
> --- a/drivers/clocksource/arm_arch_timer.c
> +++ b/drivers/clocksource/arm_arch_timer.c
> @@ -137,6 +137,8 @@ static noinstr u64 arch_counter_get_cntvct(void)
>  u64 (*arch_timer_read_counter)(void) __ro_after_init = arch_counter_get_cntvct;
>  EXPORT_SYMBOL_GPL(arch_timer_read_counter);
>  
> +u64 (*arch_timer_read_vcounter)(void) __ro_after_init = arch_counter_get_cntvct;
> +
>  static u64 arch_counter_read(struct clocksource *cs)
>  {
>  	return arch_timer_read_counter();
> @@ -931,6 +933,9 @@ static void __init arch_counter_register(void)
>  	}
>  
>  	arch_timer_read_counter = rd;
> +	arch_timer_read_vcounter = (arch_timer_counter_has_wa() ?

This matches what is done for arch_timer_read_counter but it seems a bit
surprising to me that arch_timer_counter_has_wa() is checking that the
workaround is in use and not whether the workaround should be in use. Do
we need to worry about what happens if the workaround fails to be enabled?

> +				    arch_counter_get_cntvct_stable :
> +				    arch_counter_get_cntvct);
>  	clocksource_counter.vdso_clock_mode = vdso_default;
>  
>  	width = arch_counter_get_width();
> diff --git a/include/clocksource/arm_arch_timer.h b/include/clocksource/arm_arch_timer.h
> index 2eda895f19f54..5bfd6a5db75aa 100644
> --- a/include/clocksource/arm_arch_timer.h
> +++ b/include/clocksource/arm_arch_timer.h
> @@ -88,6 +88,7 @@ struct arch_timer_mem {
>  
>  extern u32 arch_timer_get_rate(void);
>  extern u64 (*arch_timer_read_counter)(void);
> +extern u64 (*arch_timer_read_vcounter)(void);
>  extern struct arch_timer_kvm_info *arch_timer_get_kvm_info(void);
>  extern bool arch_timer_evtstrm_available(void);
>  

Thanks,

Ben



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] clocksource/drivers/arm_arch_timer: Expose a direct accessor for the virtual counter
  2026-02-26 13:48   ` Ben Horgan
@ 2026-02-26 14:03     ` Ben Horgan
  2026-02-26 16:48       ` Marc Zyngier
  0 siblings, 1 reply; 11+ messages in thread
From: Ben Horgan @ 2026-02-26 14:03 UTC (permalink / raw)
  To: Marc Zyngier, linux-arm-kernel
  Cc: Will Deacon, Catalin Marinas, Mark Rutland, Daniel Lezcano



On 2/26/26 13:48, Ben Horgan wrote:
> Hi Marc,
> 
> On 2/26/26 08:22, Marc Zyngier wrote:
>> We allow access to the architected counter via arch_timer_read_counter().
>> However, this accessor can either be the virtual or the physical
>> view of the counter, depending on how the kernel has been booted.
>>
>> At the same time, we have some architectural features (such as WFIT,
>> WFET) that rely on the virtual counter, and nothing else.
>>
>> If implementations were perfect, we'd rely on reading CNTVCT_EL0,
>> and be done with it. However, we have a bunch of broken implementations
>> in the wild, which rely on preemption being disabled and other
>> costly workarounds.
>>
>> In order to provide decent performance on non-broken HW while still
>> supporting the legacy horrors, expose arch_timer_read_vcounter() as
>> a new helper that hides this complexity.
>>
>> Signed-off-by: Marc Zyngier <maz@kernel.org>
>> ---
>>  drivers/clocksource/arm_arch_timer.c | 5 +++++
>>  include/clocksource/arm_arch_timer.h | 1 +
>>  2 files changed, 6 insertions(+)
>>
>> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
>> index 90aeff44a2764..4e4a62e1c9439 100644
>> --- a/drivers/clocksource/arm_arch_timer.c
>> +++ b/drivers/clocksource/arm_arch_timer.c
>> @@ -137,6 +137,8 @@ static noinstr u64 arch_counter_get_cntvct(void)
>>  u64 (*arch_timer_read_counter)(void) __ro_after_init = arch_counter_get_cntvct;
>>  EXPORT_SYMBOL_GPL(arch_timer_read_counter);
>>  
>> +u64 (*arch_timer_read_vcounter)(void) __ro_after_init = arch_counter_get_cntvct;
>> +
>>  static u64 arch_counter_read(struct clocksource *cs)
>>  {
>>  	return arch_timer_read_counter();
>> @@ -931,6 +933,9 @@ static void __init arch_counter_register(void)
>>  	}
>>  
>>  	arch_timer_read_counter = rd;
>> +	arch_timer_read_vcounter = (arch_timer_counter_has_wa() ?
> 
> This matches what is done for arch_timer_read_counter but it seems a bit
> surprising to me that arch_timer_counter_has_wa() is checking that the
> workaround is in use and not whether the workaround should be in use. Do
> we need to worry about what happens if the workaround fails to be enabled?

Or is the point that if you haven't enabled a relevant workaround then
all cores are treated the same and so there is no need to disable
preemption?

> 
>> +				    arch_counter_get_cntvct_stable :
>> +				    arch_counter_get_cntvct);
>>  	clocksource_counter.vdso_clock_mode = vdso_default;
>>  
>>  	width = arch_counter_get_width();
>> diff --git a/include/clocksource/arm_arch_timer.h b/include/clocksource/arm_arch_timer.h
>> index 2eda895f19f54..5bfd6a5db75aa 100644
>> --- a/include/clocksource/arm_arch_timer.h
>> +++ b/include/clocksource/arm_arch_timer.h
>> @@ -88,6 +88,7 @@ struct arch_timer_mem {
>>  
>>  extern u32 arch_timer_get_rate(void);
>>  extern u64 (*arch_timer_read_counter)(void);
>> +extern u64 (*arch_timer_read_vcounter)(void);
>>  extern struct arch_timer_kvm_info *arch_timer_get_kvm_info(void);
>>  extern bool arch_timer_evtstrm_available(void);
>>  
> 
> Thanks,
> 
> Ben
> 
> 

Thanks,

Ben



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] clocksource/drivers/arm_arch_timer: Expose a direct accessor for the virtual counter
  2026-02-26 14:03     ` Ben Horgan
@ 2026-02-26 16:48       ` Marc Zyngier
  2026-02-26 18:09         ` Will Deacon
  0 siblings, 1 reply; 11+ messages in thread
From: Marc Zyngier @ 2026-02-26 16:48 UTC (permalink / raw)
  To: Ben Horgan
  Cc: linux-arm-kernel, Will Deacon, Catalin Marinas, Mark Rutland,
	Daniel Lezcano

On Thu, 26 Feb 2026 14:03:36 +0000,
Ben Horgan <ben.horgan@arm.com> wrote:
> 
> 
> 
> On 2/26/26 13:48, Ben Horgan wrote:
> > Hi Marc,
> > 
> > On 2/26/26 08:22, Marc Zyngier wrote:
> >> We allow access to the architected counter via arch_timer_read_counter().
> >> However, this accessor can either be the virtual or the physical
> >> view of the counter, depending on how the kernel has been booted.
> >>
> >> At the same time, we have some architectural features (such as WFIT,
> >> WFET) that rely on the virtual counter, and nothing else.
> >>
> >> If implementations were perfect, we'd rely on reading CNTVCT_EL0,
> >> and be done with it. However, we have a bunch of broken implementations
> >> in the wild, which rely on preemption being disabled and other
> >> costly workarounds.
> >>
> >> In order to provide decent performance on non-broken HW while still
> >> supporting the legacy horrors, expose arch_timer_read_vcounter() as
> >> a new helper that hides this complexity.
> >>
> >> Signed-off-by: Marc Zyngier <maz@kernel.org>
> >> ---
> >>  drivers/clocksource/arm_arch_timer.c | 5 +++++
> >>  include/clocksource/arm_arch_timer.h | 1 +
> >>  2 files changed, 6 insertions(+)
> >>
> >> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
> >> index 90aeff44a2764..4e4a62e1c9439 100644
> >> --- a/drivers/clocksource/arm_arch_timer.c
> >> +++ b/drivers/clocksource/arm_arch_timer.c
> >> @@ -137,6 +137,8 @@ static noinstr u64 arch_counter_get_cntvct(void)
> >>  u64 (*arch_timer_read_counter)(void) __ro_after_init = arch_counter_get_cntvct;
> >>  EXPORT_SYMBOL_GPL(arch_timer_read_counter);
> >>  
> >> +u64 (*arch_timer_read_vcounter)(void) __ro_after_init = arch_counter_get_cntvct;
> >> +
> >>  static u64 arch_counter_read(struct clocksource *cs)
> >>  {
> >>  	return arch_timer_read_counter();
> >> @@ -931,6 +933,9 @@ static void __init arch_counter_register(void)
> >>  	}
> >>  
> >>  	arch_timer_read_counter = rd;
> >> +	arch_timer_read_vcounter = (arch_timer_counter_has_wa() ?
> > 
> > This matches what is done for arch_timer_read_counter but it seems a bit
> > surprising to me that arch_timer_counter_has_wa() is checking that the
> > workaround is in use and not whether the workaround should be in use. Do
> > we need to worry about what happens if the workaround fails to be enabled?
> 
> Or is the point that if you haven't enabled a relevant workaround then
> all cores are treated the same and so there is no need to disable
> preemption?

There are multiple things at play here:

- we cannot fail to enable a workaround. If we find one, we enable it.

- if no workaround are available, then there is no need to disable
  preemption, because the read of the counter is the same on all CPUs.

However, this code is a bug nest, and I just re-discovered an
interesting failure mode (boot on a sane CPU, keeping the broken CPUs
offline, online a broken CPU late, enjoy the fireworks).

Plus the fact that we don't indirect sched_clock(), which means we
never really enable a workaround if the boot CPU is not affected.

I have a small pile of hacks to address all of this, but I need to
convince myself that this is actually correct.

Stay tuned...

	M.

-- 
Without deviation from the norm, progress is not possible.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] clocksource/drivers/arm_arch_timer: Expose a direct accessor for the virtual counter
  2026-02-26 16:48       ` Marc Zyngier
@ 2026-02-26 18:09         ` Will Deacon
  0 siblings, 0 replies; 11+ messages in thread
From: Will Deacon @ 2026-02-26 18:09 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Ben Horgan, linux-arm-kernel, Catalin Marinas, Mark Rutland,
	Daniel Lezcano

On Thu, Feb 26, 2026 at 04:48:05PM +0000, Marc Zyngier wrote:
> On Thu, 26 Feb 2026 14:03:36 +0000,
> Ben Horgan <ben.horgan@arm.com> wrote:
> > Or is the point that if you haven't enabled a relevant workaround then
> > all cores are treated the same and so there is no need to disable
> > preemption?
> 
> There are multiple things at play here:
> 
> - we cannot fail to enable a workaround. If we find one, we enable it.
> 
> - if no workaround are available, then there is no need to disable
>   preemption, because the read of the counter is the same on all CPUs.
> 
> However, this code is a bug nest, and I just re-discovered an
> interesting failure mode (boot on a sane CPU, keeping the broken CPUs
> offline, online a broken CPU late, enjoy the fireworks).

Damn, that was the scenario I was worried about :(

I couldn't see an alternative to disabling preemption for everybody
but my brain's not firing on all cylinders this week.

> Plus the fact that we don't indirect sched_clock(), which means we
> never really enable a workaround if the boot CPU is not affected.
> 
> I have a small pile of hacks to address all of this, but I need to
> convince myself that this is actually correct.
> 
> Stay tuned...

I'll keep an eye out!

Will


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] arm64: WFxT fixes, take #2
  2026-02-26  8:22 [PATCH 0/3] arm64: WFxT fixes, take #2 Marc Zyngier
                   ` (4 preceding siblings ...)
  2026-02-26 13:36 ` Ben Horgan
@ 2026-02-27  3:16 ` Will Deacon
  5 siblings, 0 replies; 11+ messages in thread
From: Will Deacon @ 2026-02-27  3:16 UTC (permalink / raw)
  To: linux-arm-kernel, Marc Zyngier
  Cc: catalin.marinas, kernel-team, Will Deacon, Mark Rutland,
	Ben Horgan, Daniel Lezcano

On Thu, 26 Feb 2026 08:22:31 +0000, Marc Zyngier wrote:
> After my previous WFxT fix went in 7.0-rc1 as 29cc0f3aa7c64 ("arm64:
> Force the use of CNTVCT_EL0 in __delay()"), Ben reported that it isn't
> playing nice with preemption, due to the use of the timer workaround
> percpu variable. This series tries to address this, and propose a
> hopefully better alternative.
> 
> That alternative is in the form of a new "virtual counter" accessor,
> available in the same way arch_timer_read_counter() is, except that it
> is guaranteed to be the virtual counter. This helper (a function
> pointer, really) is updated on each CPU boot in the same manner the
> counter accessor is.
> 
> [...]

Applied first patch to arm64 (for-next/fixes), thanks!

[1/3] arm64: Fix sampling the "stable" virtual counter in preemptible section
      https://git.kernel.org/arm64/c/e5cb94ba5f96

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2026-02-27  3:16 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-26  8:22 [PATCH 0/3] arm64: WFxT fixes, take #2 Marc Zyngier
2026-02-26  8:22 ` [PATCH 1/3] arm64: Fix sampling the "stable" virtual counter in preemptible section Marc Zyngier
2026-02-26  8:22 ` [PATCH 2/3] clocksource/drivers/arm_arch_timer: Expose a direct accessor for the virtual counter Marc Zyngier
2026-02-26 13:48   ` Ben Horgan
2026-02-26 14:03     ` Ben Horgan
2026-02-26 16:48       ` Marc Zyngier
2026-02-26 18:09         ` Will Deacon
2026-02-26  8:22 ` [PATCH 3/3] arm64: Convert __delay_cycles() to arch_timer_read_vcounter() Marc Zyngier
2026-02-26 12:53 ` [PATCH 0/3] arm64: WFxT fixes, take #2 André Draszik
2026-02-26 13:36 ` Ben Horgan
2026-02-27  3:16 ` Will Deacon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox