All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marc Zyngier <maz@kernel.org>
To: Ben Horgan <ben.horgan@arm.com>
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	Joey Gouly <joey.gouly@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Oliver Upton <oupton@kernel.org>,
	Zenghui Yu <yuzenghui@huawei.com>, Will Deacon <will@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Hyesoo Yu <hyesoo.yu@samsung.com>,
	Quentin Perret <qperret@google.com>,
	stable@vger.kernel.org
Subject: Re: [PATCH] arm64: Force the use of CNTVCT_EL0 in __delay()
Date: Mon, 23 Feb 2026 14:31:44 +0000	[thread overview]
Message-ID: <86ldgja5v3.wl-maz@kernel.org> (raw)
In-Reply-To: <aZw3EGs4rbQvbAzV@e134344.arm.com>

Hi Ben,

On Mon, 23 Feb 2026 11:16:32 +0000,
Ben Horgan <ben.horgan@arm.com> wrote:
> 
> Hi Marc,
> 
> On Fri, Feb 13, 2026 at 02:16:19PM +0000, Marc Zyngier wrote:
> > Quentin forwards a report from Hyesoo Yu, describing an interesting
> > problem with the use of WFxT in __delay() when a vcpu is loaded and
> > that KVM is *not* in VHE mode (either nVHE or hVHE).
> > 
> > In this case, CNTVOFF_EL2 is set to a non-zero value to reflect the
> > state of the guest virtual counter. At the same time, __delay() is
> > using get_cycles() to read the counter value, which is indirected to
> > reading CNTPCT_EL0.
> > 
> > The core of the issue is that WFxT is using the *virtual* counter,
> > while the kernel is using the physical counter, and that the offset
> > introduces a really bad discrepancy between the two.
> > 
> > Fix this by forcing the use of CNTVCT_EL0, making __delay() consistent
> > irrespective of the value of CNTVOFF_EL2.
> > 
> > Reported-by: Hyesoo Yu <hyesoo.yu@samsung.com>
> > Reported-by: Quentin Perret <qperret@google.com>
> > Reviewed-by: Quentin Perret <qperret@google.com>
> > Fixes: 7d26b0516a0df ("arm64: Use WFxT for __delay() when possible")
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > Link: https://lore.kernel.org/r/ktosachvft2cgqd5qkukn275ugmhy6xrhxur4zqpdxlfr3qh5h@o3zrfnsq63od
> > Cc: stable@vger.kernel.org
> > ---
> >  arch/arm64/lib/delay.c | 19 +++++++++++++++----
> >  1 file changed, 15 insertions(+), 4 deletions(-)
> > 
> > diff --git a/arch/arm64/lib/delay.c b/arch/arm64/lib/delay.c
> > index cb2062e7e2340..d02341303899e 100644
> > --- a/arch/arm64/lib/delay.c
> > +++ b/arch/arm64/lib/delay.c
> > @@ -23,9 +23,20 @@ static inline unsigned long xloops_to_cycles(unsigned long xloops)
> >  	return (xloops * loops_per_jiffy * HZ) >> 32;
> >  }
> >  
> > +/*
> > + * Force the use of CNTVCT_EL0 in order to have the same base as WFxT.
> > + * This avoids some annoying issues when CNTVOFF_EL2 is not reset 0 on a
> > + * KVM host running at EL1 until we do a vcpu_put() on the vcpu. When
> > + * running at EL2, the effective offset is always 0.
> > + *
> > + * Note that userspace cannot change the offset behind our back either,
> > + * as the vcpu mutex is held as long as KVM_RUN is in progress.
> > + */
> > +#define __delay_cycles()	__arch_counter_get_cntvct_stable()
> 
> I'm seeing this CONFIG_DEBUG_PREEMPT warning, see below, when running 7.0-rc1 on
> FVP Base RevC. I haven't tried bisecting but it looks to be introduced by this
> change.
> 
> The calls are:
> 
> __this_cpu_read()
> erratum_handler()
> arch_timer_reg_read_stable()
> __arch_counter_get_cntvct_stable()
> __delay()
> 
> This silences the warning:
> 
> diff --git a/arch/arm64/include/asm/arch_timer.h b/arch/arm64/include/asm/arch_timer.h
> index f5794d50f51d..f07e4efa0d2b 100644
> --- a/arch/arm64/include/asm/arch_timer.h
> +++ b/arch/arm64/include/asm/arch_timer.h
> @@ -24,14 +24,14 @@
>  #define has_erratum_handler(h)                                         \
>         ({                                                              \
>                 const struct arch_timer_erratum_workaround *__wa;       \
> -               __wa = __this_cpu_read(timer_unstable_counter_workaround); \
> +               __wa = raw_cpu_read(timer_unstable_counter_workaround); \
>                 (__wa && __wa->h);                                      \
>         })
>  
>  #define erratum_handler(h)                                             \
>         ({                                                              \
>                 const struct arch_timer_erratum_workaround *__wa;       \
> -               __wa = __this_cpu_read(timer_unstable_counter_workaround); \
> +               __wa = raw_cpu_read(timer_unstable_counter_workaround); \
>                 (__wa && __wa->h) ? ({ isb(); __wa->h;}) : arch_timer_##h; \
>         })

It does indeed silence it, but that's IMO the wrong thing to do since
you can end-up calling a workaround helper on the wrong CPU if
preempted.  If you look at how things were done before this patch, we
had:

get_cycles() -> arch_timer_read_counter() -> arch_counter_get_cntvct_stable()

Crucially, arch_counter_get_cntvct_stable() does disable preemption,
and we should preserve it. Something like this:

diff --git a/arch/arm64/lib/delay.c b/arch/arm64/lib/delay.c
index d02341303899e..25fb593f95b0c 100644
--- a/arch/arm64/lib/delay.c
+++ b/arch/arm64/lib/delay.c
@@ -32,7 +32,16 @@ static inline unsigned long xloops_to_cycles(unsigned long xloops)
  * Note that userspace cannot change the offset behind our back either,
  * as the vcpu mutex is held as long as KVM_RUN is in progress.
  */
-#define __delay_cycles()	__arch_counter_get_cntvct_stable()
+static cycles_t __delay_cycles(void)
+{
+	cycles_t val;
+
+	preempt_disable();
+	val = __arch_counter_get_cntvct_stable();
+	preenpt_enable();
+
+	return val;
+}
 
 void __delay(unsigned long cycles)
 {

The question is whether there is a material benefit in replicating the
arch_timer_read_counter() indirection for the virtual counter in order
to not pay the price of preempt_disable() when we're on a non-broken
system (hopefully the vast majority of implementations).

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

  reply	other threads:[~2026-02-23 14:31 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-13 14:16 [PATCH] arm64: Force the use of CNTVCT_EL0 in __delay() Marc Zyngier
2026-02-19 13:27 ` Will Deacon
2026-02-23 11:16 ` Ben Horgan
2026-02-23 14:31   ` Marc Zyngier [this message]
2026-02-23 15:14     ` Ben Horgan
2026-02-25 22:36     ` Will Deacon
2026-02-26  8:16       ` Marc Zyngier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=86ldgja5v3.wl-maz@kernel.org \
    --to=maz@kernel.org \
    --cc=ben.horgan@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=hyesoo.yu@samsung.com \
    --cc=joey.gouly@arm.com \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=oupton@kernel.org \
    --cc=qperret@google.com \
    --cc=stable@vger.kernel.org \
    --cc=suzuki.poulose@arm.com \
    --cc=will@kernel.org \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.