Date: Wed, 07 Jun 2023 09:37:08 +0100
Message-ID: <87h6rjoeh7.wl-maz@kernel.org>
From: Marc Zyngier
To: Oliver Upton, Nathan Chancellor
Cc: Jean-Philippe Brucker, james.morse@arm.com, suzuki.poulose@arm.com,
    yuzenghui@huawei.com, catalin.marinas@arm.com, will@kernel.org,
    linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev
Subject: Re: [PATCH 1/4] KVM: arm64: vgic: Fix a circular locking issue
References: <20230518100914.2837292-1-jean-philippe@linaro.org>
    <20230518100914.2837292-2-jean-philippe@linaro.org>
    <20230606221525.GA2269598@dev-arch.thelio-3990X>

On Wed, 07 Jun 2023 06:23:24 +0100, Oliver Upton wrote:
>
> Nathan,
>
> First and foremost, thanks for testing this.
>
> On Tue, Jun 06, 2023 at 03:15:25PM -0700, Nathan Chancellor wrote:
> > My apologies if this has been addressed or reported somewhere, I did a
> > search of lore.kernel.org and browsed the kvmarm archives and did not
> > see anything.
>
> This is news to me, but even if it had already been reported there's
> nothing wrong with bumping the issue. Makes it hard for us to bury our
> heads in the sand :)

AFAICT, this is the very first report of this problem.

>
> > After this change landed in 6.4-rc5 as commit 59112e9c390b
> > ("KVM: arm64: vgic: Fix a circular locking issue"), my QEMU Fedora VM on
> > my SolidRun Honeycomb fails to get to GRUB.
>
> [...]
>
> > I built a kernel with CONFIG_PROVE_LOCKING=y but I do not see any splats
> > while this is occurring. Additionally, neither my Raspberry Pi 4 or my
> > Ampere Altra system have any issues, so it is possible this could be a
> > platform specific problem. I am more than happy to provide any
> > additional information and test kernels and patches to help get to the
> > bottom of this. My kernel configuration is attached.
>
> I was unable to reproduce the issues you're seeing on 6.4-rc5, but I
> don't have any different machines from you available atm. Based on
> your description it sounds like your VM was able to do _something_
> since it sounds like a few escape codes got out over serial...
> I'm wondering if you're getting wedged somewhere on a VGIC MMIO access.
>
> We don't have a precise tracepoint for VGIC accesses, but kvm:kvm_mmio
> should do the trick. So, given that you're the lucky winner at
> reproducing this bug right now, do you mind collecting a dump from that
> tracepoint and sharing the access that happens before your VM gets
> wedged?
>
> Curious if Marc has any additional insight, since (unsurprisingly) he
> has a lot more experience in dealing with the GIC than I. In the
> meantime I'll stare at the locking flows and see if anything stands
> out.

RPI4 is GICv2 nVHE, the NXP machine is GICv3 nVHE, and the Altra is
GICv3 VHE. Not sure this is relevant here, but that's one data point.

Having been able to start the guest means that we should have fully
initialised the GIC. So a lockup is likely to be an interaction with
the GIC emulation itself, either because we failed to release a lock
during initialisation, or due to some logic error in the GIC emulation
(which is not necessarily MMIO...).

I've just given 6.4-rc5 a go on my Synquacer, which is the closest
thing I have to Nathan's NXP box, and I can't spot anything odd.

It would also help to get access to the EDK2 build. It wouldn't be the
first time that a change in KVM breaks some EDK2 behaviour.

Finally, on top of the traces that Oliver asked for above, looking at
where the QEMU vcpu threads are would be interesting (I assume they'd
be sleeping in the kernel).

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
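
For the kvm:kvm_mmio dump requested above, a minimal sketch of how the
captured text could be reduced to the last few accesses before the
hang. It assumes each event is printed as "kvm_mmio: mmio <type> len
<n> gpa 0x<addr> val 0x<value>"; that layout should be checked against
the tracepoint's format file on the host. The function name, the
regular expression and the sample lines are hypothetical and not taken
from this thread.

```python
import re
from collections import deque

# Assumed text layout of a kvm:kvm_mmio event; verify it against the
# tracepoint's "format" file before relying on it.
MMIO_RE = re.compile(
    r"kvm_mmio: mmio (?P<kind>\S+) len (?P<len>\d+) "
    r"gpa 0x(?P<gpa>[0-9a-fA-F]+) val 0x(?P<val>[0-9a-fA-F]+)"
)


def last_mmio_accesses(lines, keep=16):
    """Return the last `keep` kvm_mmio events found in `lines`.

    Each event is a tuple (kind, length, gpa, val). Lines that do not
    match the assumed layout are skipped.
    """
    tail = deque(maxlen=keep)  # older events fall off the front
    for line in lines:
        m = MMIO_RE.search(line)
        if m:
            tail.append(
                (m["kind"], int(m["len"]), int(m["gpa"], 16), int(m["val"], 16))
            )
    return list(tail)


# Hypothetical sample input, made up purely to exercise the parser.
SAMPLE = [
    "qemu-1234 [002] 100.000001: kvm_mmio: mmio write len 4 gpa 0x8000000 val 0x1",
    "qemu-1234 [002] 100.000002: kvm_entry: PC: 0x0000000040000000",
    "qemu-1234 [002] 100.000003: kvm_mmio: mmio read len 4 gpa 0x8000004 val 0x7",
    "qemu-1234 [003] 100.000004: kvm_mmio: mmio read len 4 gpa 0x80a0000 val 0x0",
]

for kind, length, gpa, val in last_mmio_accesses(SAMPLE, keep=2):
    print(f"{kind:<16} len={length} gpa={gpa:#x} val={val:#x}")
```

In practice the input would be the lines of the captured trace, for
example the tracefs trace file read after the event has been enabled
under events/kvm/kvm_mmio. Only the tail is kept because the access of
interest is the one just before the guest stops making progress.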