From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6E6A2C433F5 for ; Thu, 17 Mar 2022 10:17:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232504AbiCQKSh (ORCPT ); Thu, 17 Mar 2022 06:18:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39262 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232419AbiCQKSg (ORCPT ); Thu, 17 Mar 2022 06:18:36 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 10326E0985 for ; Thu, 17 Mar 2022 03:17:19 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 751BD617D5 for ; Thu, 17 Mar 2022 10:17:19 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D19A3C340E9; Thu, 17 Mar 2022 10:17:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1647512238; bh=Ac3+Zu4JJhOiqtXO0r903zb8S377HU8RCpuDNXxEp2w=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=LIxewhOO7mTDGWxC+uGpYafMzpVkySVAs6tXhhapSeaXVim+9LNI7555bLoYnApeN n/b2xuUZpQXwKY6McT6P4ScoLs/JLLx3pwXXq7M134j0HTVMzwXA2/Du06nIK4tVic f1KAqb2NfmPLZLGIGXuTEkdeUspaVpScyjosUHPdXBvG47ifksIqpkLepnizo6vvTp aPV/DWF4VtB7JBQlew+Nw8j21TwmRnn7jq+5cAIMPabcKFrW86OpzB2Vu9/l1H8hu4 gWla4KDA67C4/j+yzLIlGunHoH6Q8zwat3WwZ0BQSDuWm/u4tu7I2OGui8WaD049xG +fHeBdE3Df2/g== Received: from sofa.misterjones.org ([185.219.108.64] helo=why.misterjones.org) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nUnC0-00F92j-GX; Thu, 17 Mar 2022 10:17:16 +0000 Date: Thu, 17 Mar 2022 10:17:16 +0000 Message-ID: <87v8wcyjbn.wl-maz@kernel.org> From: Marc Zyngier To: Jingyi Wang Cc: , , , "wanghaibin.wang@huawei.com" , "yuzenghui@huawei.com" , , , Subject: Re: Report an error on GICv4.1 vcpu de-schedule In-Reply-To: <4aae10ba-b39a-5f84-754b-69c2eb0a2c03@huawei.com> References: <4aae10ba-b39a-5f84-754b-69c2eb0a2c03@huawei.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/27.1 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: wangjingyi11@huawei.com, linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org, wanghaibin.wang@huawei.com, yuzenghui@huawei.com, Martin.Weidmann@arm.com, tangnianyao@huawei.com, chengjian8@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Hi Jingyi, On Thu, 17 Mar 2022 07:27:45 +0000, Jingyi Wang wrote: >=20 > Hi Marc=EF=BC=8C >=20 > The patch "KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty > bit"(57e3cebd022fbc035dcf190ac789fd2ffc747f5b) remove the polling of > GICR_VPENDBASER.Dirty bit in vcpu_load() , while check the VPT parsing > ready in kvm_vgic_flush_hwstate() for better performance. >=20 > Most time it works, but we have met an error on our hardware recently. > In preemptable kernel, the vcpu can be preempted between vcpu_load and > kvm_vgic_flush_hwstate. As a result, it get de-scheduled and > its_clear_vpend_valid() is called >=20 > val =3D gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER); > val &=3D ~GICR_VPENDBASER_Valid; > val &=3D ~clr; > val |=3D set; > gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER); >=20 >=20 > The function clears Valid bit meanwhile GICR_VPENDBASER_Dirty > maybe still 1, which cause the subsequent GICR_VPENDBASER_Dirty polling > fail and report ""ITS virtual pending table not cleaning". >=20 > We have communicated with Martin from ARM and get the conclusion > that we should not change valid bit while the dirty bit not clear=E2=80= =94=E2=80=94 > "The dirty bit reports whether the last schedule /de-schedule > operation has completed.The restriction on not changing Valid when Dirty > is 1, is so that hardware can always complete the last operation for > starting the next". Indeed, the spec is crystal clear about that, and clearing Valid while Dirty is set is plain wrong. >=20 > I think maybe we can check dirty bit clear before clearing the valid bit > in its_clear_vpend_valid() code. Hope to know your opinion about this > issue. Yes, that's what should happen. I came up with the patch below. Please give it a shot and let me know if that helps. If it does, I'll queue it as a fix. Thanks, M. =46rom c23ccc9cfa603e30ac189d43af75f03b60d780bc Mon Sep 17 00:00:00 2001 From: Marc Zyngier Date: Thu, 17 Mar 2022 09:49:02 +0000 Subject: [PATCH] irqchip/gic-v4: Wait for GICR_VPENDBASER.Dirty to clear before descheduling The way KVM drives GICv4.{0,1} is as follows: - vcpu_load() makes the VPE resident, instructing the RD to start scanning for interrupts - just before entering the guest, we check that the RD has finished scanning and that we can start running the vcpu - on preemption, we deschedule the VPE by making it invalid on the RD However, we are preemptible between the first two steps. If it so happens *and* that the RD was still scanning, we nonetheless write to the GICR_VPENDBASER register while Dirty is set, and bad things happen (we're in UNPRED land). This affects both the 4.0 and 4.1 implementations. Make sure Dirty is cleared before performing the deschedule, meaning that its_clear_vpend_valid() becomes a sort of full VPE residency barrier. Reported-by: Jingyi Wang Signed-off-by: Marc Zyngier Fixes: 57e3cebd022f ("KVM: arm64: Delay the polling of the GICR_VPENDBASER.= Dirty bit") Link: https://lore.kernel.org/r/4aae10ba-b39a-5f84-754b-69c2eb0a2c03@huawei= .com --- drivers/irqchip/irq-gic-v3-its.c | 28 +++++++++++++++++++--------- 1 file changed, 19 insertions(+), 9 deletions(-) diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-= its.c index 9e93ff2b6375..c9b1df980899 100644 --- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -3011,18 +3011,12 @@ static int __init allocate_lpi_tables(void) return 0; } =20 -static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set) +static u64 read_vpend_dirty_clear(void __iomem *vlpi_base) { u32 count =3D 1000000; /* 1s! */ bool clean; u64 val; =20 - val =3D gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER); - val &=3D ~GICR_VPENDBASER_Valid; - val &=3D ~clr; - val |=3D set; - gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER); - do { val =3D gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER); clean =3D !(val & GICR_VPENDBASER_Dirty); @@ -3033,10 +3027,26 @@ static u64 its_clear_vpend_valid(void __iomem *vlpi= _base, u64 clr, u64 set) } } while (!clean && count); =20 - if (unlikely(val & GICR_VPENDBASER_Dirty)) { + if (unlikely(!clean)) pr_err_ratelimited("ITS virtual pending table not cleaning\n"); + + return val; +} + +static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set) +{ + u64 val; + + /* Make sure we wait until the RD is done with the initial scan */ + val =3D read_vpend_dirty_clear(vlpi_base); + val &=3D ~GICR_VPENDBASER_Valid; + val &=3D ~clr; + val |=3D set; + gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER); + + val =3D read_vpend_dirty_clear(vlpi_base); + if (unlikely(val & GICR_VPENDBASER_Dirty)) val |=3D GICR_VPENDBASER_PendingLast; - } =20 return val; } --=20 2.34.1 --=20 Without deviation from the norm, progress is not possible.