From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 23 Feb 2026 14:31:44 +0000
Message-ID: <86ldgja5v3.wl-maz@kernel.org>
From: Marc Zyngier
To: Ben Horgan
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	Joey Gouly , Suzuki K Poulose , Oliver Upton , Zenghui Yu ,
	Will Deacon , Catalin Marinas , Hyesoo Yu , Quentin Perret ,
	stable@vger.kernel.org
Subject: Re: [PATCH] arm64: Force the use of CNTVCT_EL0 in __delay()
In-Reply-To:
References: <20260213141619.1791283-1-maz@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Hi Ben,

On Mon, 23 Feb 2026 11:16:32 +0000,
Ben Horgan wrote:
> 
> Hi Marc,
> 
> On Fri, Feb 13, 2026 at 02:16:19PM +0000, Marc Zyngier wrote:
> > Quentin forwards a report from Hyesoo Yu, describing an interesting
> > problem with the use of WFxT in __delay() when a vcpu is loaded and
> > KVM is *not* in VHE mode (either nVHE or hVHE).
> > 
> > In this case, CNTVOFF_EL2 is set to a non-zero value to reflect the
> > state of the guest virtual counter. At the same time, __delay() is
> > using get_cycles() to read the counter value, which is indirected to
> > reading CNTPCT_EL0.
> > 
> > The core of the issue is that WFxT is using the *virtual* counter,
> > while the kernel is using the physical counter, and the offset
> > introduces a really bad discrepancy between the two.
> > 
> > Fix this by forcing the use of CNTVCT_EL0, making __delay() consistent
> > irrespective of the value of CNTVOFF_EL2.
> > 
> > Reported-by: Hyesoo Yu
> > Reported-by: Quentin Perret
> > Reviewed-by: Quentin Perret
> > Fixes: 7d26b0516a0df ("arm64: Use WFxT for __delay() when possible")
> > Signed-off-by: Marc Zyngier
> > Link: https://lore.kernel.org/r/ktosachvft2cgqd5qkukn275ugmhy6xrhxur4zqpdxlfr3qh5h@o3zrfnsq63od
> > Cc: stable@vger.kernel.org
> > ---
> >  arch/arm64/lib/delay.c | 19 +++++++++++++++----
> >  1 file changed, 15 insertions(+), 4 deletions(-)
> > 
> > diff --git a/arch/arm64/lib/delay.c b/arch/arm64/lib/delay.c
> > index cb2062e7e2340..d02341303899e 100644
> > --- a/arch/arm64/lib/delay.c
> > +++ b/arch/arm64/lib/delay.c
> > @@ -23,9 +23,20 @@ static inline unsigned long xloops_to_cycles(unsigned long xloops)
> >  	return (xloops * loops_per_jiffy * HZ) >> 32;
> >  }
> >  
> > +/*
> > + * Force the use of CNTVCT_EL0 in order to have the same base as WFxT.
> > + * This avoids some annoying issues when CNTVOFF_EL2 is not reset to 0
> > + * on a KVM host running at EL1 until we do a vcpu_put() on the vcpu.
> > + * When running at EL2, the effective offset is always 0.
> > + *
> > + * Note that userspace cannot change the offset behind our back either,
> > + * as the vcpu mutex is held as long as KVM_RUN is in progress.
> > + */
> > +#define __delay_cycles()	__arch_counter_get_cntvct_stable()
> 
> I'm seeing this CONFIG_DEBUG_PREEMPT warning, see below, when running 7.0-rc1 on
> FVP Base RevC. I haven't tried bisecting but it looks to be introduced by this
> change.
> 
> The calls are:
> 
> __this_cpu_read()
> erratum_handler()
> arch_timer_reg_read_stable()
> __arch_counter_get_cntvct_stable()
> __delay()
> 
> This silences the warning:
> 
> diff --git a/arch/arm64/include/asm/arch_timer.h b/arch/arm64/include/asm/arch_timer.h
> index f5794d50f51d..f07e4efa0d2b 100644
> --- a/arch/arm64/include/asm/arch_timer.h
> +++ b/arch/arm64/include/asm/arch_timer.h
> @@ -24,14 +24,14 @@
>  #define has_erratum_handler(h)						\
>  	({								\
>  		const struct arch_timer_erratum_workaround *__wa;	\
> -		__wa = __this_cpu_read(timer_unstable_counter_workaround); \
> +		__wa = raw_cpu_read(timer_unstable_counter_workaround);	\
>  		(__wa && __wa->h);					\
>  	})
>  
>  #define erratum_handler(h)						\
>  	({								\
>  		const struct arch_timer_erratum_workaround *__wa;	\
> -		__wa = __this_cpu_read(timer_unstable_counter_workaround); \
> +		__wa = raw_cpu_read(timer_unstable_counter_workaround);	\
>  		(__wa && __wa->h) ? ({ isb(); __wa->h;}) : arch_timer_##h; \
>  	})

It does indeed silence it, but that's IMO the wrong thing to do, since
you can end up calling a workaround helper on the wrong CPU if preempted.

If you look at how things were done before this patch, we had:

get_cycles() -> arch_timer_read_counter() -> arch_counter_get_cntvct_stable()

Crucially, arch_counter_get_cntvct_stable() does disable preemption, and
we should preserve it. Something like this:

diff --git a/arch/arm64/lib/delay.c b/arch/arm64/lib/delay.c
index d02341303899e..25fb593f95b0c 100644
--- a/arch/arm64/lib/delay.c
+++ b/arch/arm64/lib/delay.c
@@ -32,7 +32,16 @@ static inline unsigned long xloops_to_cycles(unsigned long xloops)
  * Note that userspace cannot change the offset behind our back either,
  * as the vcpu mutex is held as long as KVM_RUN is in progress.
  */
-#define __delay_cycles()	__arch_counter_get_cntvct_stable()
+static cycles_t __delay_cycles(void)
+{
+	cycles_t val;
+
+	preempt_disable();
+	val = __arch_counter_get_cntvct_stable();
+	preempt_enable();
+
+	return val;
+}
 
 void __delay(unsigned long cycles)
 {

The question is whether there is a material benefit in replicating the
arch_timer_read_counter() indirection for the virtual counter in order
to not pay the price of preempt_disable() when we're on a non-broken
system (hopefully the vast majority of implementations).

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.