Date: Tue, 16 Jul 2024 11:30:05 +0100
Message-ID: <86r0bt39zm.wl-maz@kernel.org>
From: Marc Zyngier
To: Johan Hovold
Cc: Thomas Gleixner, LKML, linux-arm-kernel@lists.infradead.org,
    linux-pci@vger.kernel.org, anna-maria@linutronix.de, shawnguo@kernel.org,
    s.hauer@pengutronix.de, festevam@gmail.com, bhelgaas@google.com,
    rdunlap@infradead.org, vidyas@nvidia.com, ilpo.jarvinen@linux.intel.com,
    apatel@ventanamicro.com, kevin.tian@intel.com, nipun.gupta@amd.com,
    den@valinux.co.jp, andrew@lunn.ch, gregory.clement@bootlin.com,
    sebastian.hesselbarth@gmail.com, gregkh@linuxfoundation.org,
    rafael@kernel.org, alex.williamson@redhat.com, will@kernel.org,
    lorenzo.pieralisi@arm.com, jgg@mellanox.com, ammarfaizi2@gnuweeb.org,
    robin.murphy@arm.com, lpieralisi@kernel.org, nm@ti.com, kristo@kernel.org,
    vkoul@kernel.org, okaya@kernel.org, agross@kernel.org,
    andersson@kernel.org, mark.rutland@arm.com,
    shameerali.kolothum.thodi@huawei.com, yuzenghui@huawei.com,
    shivamurthy.shastri@linutronix.de
Subject: Re: [patch V4 00/21] genirq, irqchip: Convert ARM MSI handling to per device MSI domains
References: <20240623142137.448898081@linutronix.de> <878qy26cd6.wl-maz@kernel.org>

On Mon, 15 Jul 2024 15:10:01 +0100,
Johan Hovold wrote:
>
> On Mon, Jul 15, 2024 at 01:58:13PM +0100, Marc Zyngier wrote:
> > On Mon, 15 Jul 2024 12:18:47 +0100,
> > Johan Hovold wrote:
> > > On Sun, Jun 23, 2024 at 05:18:31PM +0200, Thomas Gleixner wrote:
> > > > This is version 4 of the series to convert ARM MSI handling over to
> > > > per device MSI domains.
>
> > > This series only showed up in linux-next last Friday and broke interrupt
> > > handling on Qualcomm platforms like sc8280xp (e.g. Lenovo ThinkPad X13s)
> > > and x1e80100 that use the GIC ITS for PCIe MSIs.
> > >
> > > I've applied the series (21 commits from linux-next) on top of 6.10 and
> > > can confirm that the breakage is caused by commits:
> > >
> > >     3d1c927c08fc ("irqchip/gic-v3-its: Switch platform MSI to MSI parent")
> > >     233db05bc37f ("irqchip/gic-v3-its: Provide MSI parent for PCI/MSI[-X]")
> > >
> > > Applying the series up until the change before 3d1c927c08fc unbreaks the
> > > wifi on one machine:
> > >
> > >     ath11k_pci 0006:01:00.0: failed to enable msi: -22
> > >     ath11k_pci 0006:01:00.0: probe with driver ath11k_pci failed with error -22
> > >
> > > and backing up until the commit before 233db05bc37f makes the NVMe come
> > > up again during boot on another.
> > >
> > > I have not tried to debug this further.
> >
> > I need a few things from you though, because you're not giving much to
> > help you (and I'm travelling, which doesn't help).
>
> Yeah, this was just an early heads up.
>
> > Can you at least investigate what in ath11k_pci_alloc_msi() causes the
> > wifi driver to be upset? Does it normally use a single MSI vector or
> > MSI-X? How about your nVME device?
>
> It uses multiple vectors, but now it falls back to trying to allocate a
> single one and even that fails with -ENOSPC:
>
>     ath11k_pci 0006:01:00.0: ath11k_pci_alloc_msi - requesting one vector failed: -28
>
> Similar for the NVMe, it uses multiple vectors normally, but now only
> the AER interrupts appears to be allocated for each controller and there
> is a GICv3 interrupt for the NVMe:
>
> 208:     0     0     0     0     0     0     0     0  ITS-PCI-MSI-0006:00:00.0    0 Edge   PCIe PME, aerdrv
> 212:     0     0     0     0     0     0     0     0  ITS-PCI-MSI-0004:00:00.0    0 Edge   PCIe PME, aerdrv
> 214:   161     0     0     0     0     0     0     0  GICv3                     562 Level  nvme0q0, nvme0q1
> 215:     0     0     0     0     0     0     0     0  ITS-PCI-MSI-0002:00:00.0    0 Edge   PCIe PME, aerdrv
>

That's an indication of the driver having failed its MSI allocation
and gone back to INTx signalling.

> Next boot, after disabling PCIe controller async probing, it's an MSI-X?!:
>
> 201:     0     0     0     0     0     0     0     0  ITS-PCI-MSI-0006:00:00.0    0 Edge   PCIe PME, aerdrv
> 203:     0     0     0     0     0     0     0     0  ITS-PCI-MSI-0004:00:00.0    0 Edge   PCIe PME, aerdrv
> 205:     0     0     0     0     0     0     0     0  ITS-PCI-MSI-0002:00:00.0    0 Edge   PCIe PME, aerdrv
> 206:     0     0     0     0     0     0     0     0  ITS-PCI-MSIX-0002:01:00.0   0 Edge   nvme0q0
>

So is this issue actually tied to the async probing? Does it always
work if you disable it?

> This time ath11k vector allocation succeeded, but the driver times out
> eventually:
>
> [    8.984619] ath11k_pci 0006:01:00.0: MSI vectors: 32
> [   29.690841] ath11k_pci 0006:01:00.0: failed to power up mhi: -110
> [   29.697136] ath11k_pci 0006:01:00.0: failed to start mhi: -110
> [   29.703153] ath11k_pci 0006:01:00.0: failed to power up :-110
> [   29.732144] ath11k_pci 0006:01:00.0: failed to create soc core: -110
> [   29.738694] ath11k_pci 0006:01:00.0: failed to init core: -110
> [   32.841758] ath11k_pci 0006:01:00.0: probe with driver ath11k_pci failed with error -110
>
> > It would also help if you could define the DEBUG symbol at the very
> > top of irq-gic-v3-its.c and report the debug information that the ITS
> > driver dumps.
>
> See below (with synchronous probing of the pcie controllers).

I don't see much going wrong there, and the ITS driver correctly
dishes out interrupts. I'll take the current -next for a ride on my
own HW and see what happens.

	M.

-- 
Without deviation from the norm, progress is not possible.
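
The /proc/interrupts excerpts quoted in this message are read by looking at the irqchip column: a row served by "GICv3" with a Level trigger is a wired interrupt (consistent with the INTx fallback described above), while "ITS-PCI-MSI-..." and "ITS-PCI-MSIX-..." rows are message-signalled. A minimal sketch of that reading follows. It is not part of the original message, and it assumes the 8-CPU column layout shown in the excerpts. The function names `classify` and `errno_name` are hypothetical helpers introduced only for illustration.

```python
import errno

# Rows copied from the /proc/interrupts excerpts quoted in this thread.
SAMPLE = """\
214: 161 0 0 0 0 0 0 0 GICv3 562 Level nvme0q0, nvme0q1
206: 0 0 0 0 0 0 0 0 ITS-PCI-MSIX-0002:01:00.0 0 Edge nvme0q0
"""


def classify(line, ncpus=8):
    """Return (irq, kind, actions) for one /proc/interrupts row.

    Assumed layout (as in the thread): IRQ number, one counter per CPU,
    irqchip name, hwirq, trigger type, then the action names.
    """
    fields = line.split()
    irq = int(fields[0].rstrip(":"))
    chip = fields[1 + ncpus]
    # Check the MSI-X prefix first, since "ITS-PCI-MSI" is a prefix of it.
    if chip.startswith("ITS-PCI-MSIX"):
        kind = "MSI-X"
    elif chip.startswith("ITS-PCI-MSI"):
        kind = "MSI"
    else:
        kind = "non-MSI"
    actions = " ".join(fields[4 + ncpus:])
    return irq, kind, actions


def errno_name(code):
    """Map a negative kernel return value such as -28 to its symbolic name."""
    return errno.errorcode.get(abs(code), "unknown")


if __name__ == "__main__":
    for row in SAMPLE.splitlines():
        print(classify(row))
    print(errno_name(-28))
```

Run against the two sample rows, the first (IRQ 214, from the failing boot) is classified as non-MSI and the second (IRQ 206, from the boot with async probing disabled) as MSI-X. The errno lookup uses the host's errno table, so the names are only meaningful when run on Linux.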