Date: Wed, 24 Jun 2026 08:07:28 +0100
Message-ID: <86o6h0quvj.wl-maz@kernel.org>
From: Marc Zyngier <maz@kernel.org>
To: Jinqian Yang <yangjinqian1@huawei.com>
Cc: lpieralisi@kernel.org, tglx@kernel.org, alex@shazbot.org,
	linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	liuyonglong@huawei.com, wangzhou1@hisilicon.com, linuxarm@huawei.com
Subject: Re: [RFC PATCH] irqchip/gic-v3-its: enable dynamic MSI-X allocation
In-Reply-To: <20260624025345.458387-1-yangjinqian1@huawei.com>
References: <20260624025345.458387-1-yangjinqian1@huawei.com>

On Wed, 24 Jun 2026 03:53:45 +0100,
Jinqian Yang wrote:
> 
> On ARM64 platforms with GICv3 ITS, VFIO PCI passthrough currently
> cannot dynamically allocate MSI-X vectors after MSI-X has been
> enabled. When QEMU needs to extend the vector range, it must
> disable MSI-X, free all interrupts, then re-enable with a larger
> allocation. This creates an interrupt loss window for already-active
> vectors.
> 
> Consider HNS3 with RoCE: NIC and RDMA share one PCI device and
> ITS DeviceID, with MSI-X vectors partitioned as NIC (lower range)
> then RoCE (starting at base_vector = num_nic_msi). In VFIO
> passthrough, loading hns_roce after hns3 forces QEMU to tear down
> all interrupts before re-allocating the larger range. During this
> process, NIC interrupts may be lost. Testing confirmed that this
> occasionally occurs, causing the network port reset to fail.

Well, that's what you get for not exposing differentiated
functions. Eventually, you face the reality that this is a poor
design.

> 
> ITS_MSI_FLAGS_SUPPORTED lacks MSI_FLAG_PCI_MSIX_ALLOC_DYN, causing
> pci_msix_can_alloc_dyn() to return false. VFIO then sets
> has_dyn_msix=false and never clears VFIO_IRQ_INFO_NORESIZE for
> MSI-X, keeping the old "disable and reallocate" behavior.
> 
> The essential prerequisite for enabling this flag is the fix to
> msi_prepare() call timing (commit 1396e89e09f0 ("genirq/msi: Move
> prepare() call to per-device allocation")): msi_prepare() is
> now called once at per-device domain creation with hwsize, so ITS
> creates an ITT with sufficient capacity for all MSI-X vectors.
> Without this fix, msi_prepare() was called per-allocation with
> semi-random nvec, maybe resulting in an ITT too small for dynamic
> vector addition.

How is this paragraph relevant? The kernel has had this fix for over a
year, and backporting this series is not something I plan to ever do.

> 
> With this in place, dynamic MSI-X allocation works correctly:
> msi_domain_alloc_irq_at() uses populate_alloc_info() to copy the
> pre-prepared alloc_data without re-invoking msi_prepare(), so each
> new vector simply gets a LPI entry in the already-allocated ITT,
> without affecting existing vectors.
> 
> Signed-off-by: Jinqian Yang
> ---
>  drivers/irqchip/irq-gic-its-msi-parent.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/irqchip/irq-gic-its-msi-parent.c b/drivers/irqchip/irq-gic-its-msi-parent.c
> index b9257103a999..b2b9d2068bb1 100644
> --- a/drivers/irqchip/irq-gic-its-msi-parent.c
> +++ b/drivers/irqchip/irq-gic-its-msi-parent.c
> @@ -18,7 +18,8 @@
>  
>  #define ITS_MSI_FLAGS_SUPPORTED	(MSI_GENERIC_FLAGS_MASK |	\
>  					 MSI_FLAG_PCI_MSIX      |	\
> -					 MSI_FLAG_MULTI_PCI_MSI)
> +					 MSI_FLAG_MULTI_PCI_MSI |	\
> +					 MSI_FLAG_PCI_MSIX_ALLOC_DYN)
>  
>  static int its_translate_frame_address(struct fwnode_handle *msi_node, phys_addr_t *pa)
>  {

What has this been tested with? In which conditions?

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.