Date: Tue, 31 Mar 2026 14:01:03 -0500
From: Bjorn Helgaas
To: "Kuehling, Felix"
Cc: Gerd Bayer, Alex Deucher, Christian König, Selvin Xavier, Kalesh AP,
    Jason Gunthorpe, Leon Romanovsky, Michal Kalderon, Saeed Mahameed,
    Tariq Toukan, Mark Bloch, Bjorn Helgaas, Jay Cornwall, Ilpo Järvinen,
    Christian Borntraeger, Niklas Schnelle, Gerald Schaefer, Heiko Carstens,
    Vasily Gorbik, Alexander Gordeev, Sven Schnelle, Alexander Schmidt,
    linux-s390@vger.kernel.org, linux-pci@vger.kernel.org,
    linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
    linux-rdma@vger.kernel.org
Subject: Re: [PATCH v7 1/3] PCI: AtomicOps: Do not enable requests by RCiEPs
Message-ID: <20260331190103.GA150932@bhelgaas>
In-Reply-To: <688a8e59-3188-4c9e-a8ed-d53b7d965e10@amd.com>
X-Mailing-List: netdev@vger.kernel.org

On Tue, Mar 31, 2026 at 02:39:26PM -0400, Kuehling, Felix wrote:
> On 2026-03-31 14:09, Bjorn Helgaas wrote:
> > On Mon, Mar 30, 2026 at 08:01:57PM -0400, Kuehling, Felix wrote:
> > > On 2026-03-30 17:42, Bjorn Helgaas wrote:
> > > > [+to amdgpu, bnxt_re, mlx5 IB, qedr, mlx5 maintainers]
> > > >
> > > > On Mon, Mar 30, 2026 at 03:09:44PM +0200, Gerd Bayer wrote:
> > > > > Since root complex integrated end points (RCiEPs) attach to a bus that
> > > > > has no bridge device describing the root port, the capability to
> > > > > complete AtomicOps requests cannot be determined with PCIe methods.
> > > > >
> > > > > Change default of pci_enable_atomic_ops_to_root() to not enable
> > > > > AtomicOps requests on RCiEPs.
> > > > I know I suggested this because there's nothing explicit that tells us
> > > > whether the RC supports atomic ops from RCiEPs [1]. But I'm concerned
> > > > that GPUs, infiniband HCAs, and NICs that use atomic ops may be
> > > > implemented as RCiEPs and would be broken by this.
> > > FWIW, on AMD APUs our driver doesn't call pci_enable_atomic_ops_to_root. It
> > > just assumes that the GPU can do atomic accesses because it doesn't actually
> > > go through PCIe: https://elixir.bootlin.com/linux/v6.19.10/source/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c#L4785
> > What does this mean for the other branch that *does* use
> > pci_enable_atomic_ops_to_root()? Can any of those devices be RCiEPs?
>
> Most AMD GPUs are not integrated endpoints. APUs are integrated. There are
> A+A GPUs where the GPUs are separate from the CPU but part of the same
> coherent data fabric as the CPU (adev->gmc.xgmi.connected_to_cpu == true).
> Those may also be considered RCiEPs. (I'm not sure about that, is there an
> easy way to check with lspci?) We may need to include that in the same
> branch as APUs.

Yep, for RCiEPs, "lspci -v" should say something like this:

  Capabilities: [64] Express Root Complex Integrated Endpoint

Dmesg logs from recent kernels would also include it like this:

  pci 0000:00:02.0: [8086:5916] type 00 class 0x030000 PCIe Root Complex Integrated Endpoint

An RCiEP would be on the root bus; it would not be below a Root Port.

> You can see that we did that for a new generation of A+A GPU here: https://gitlab.freedesktop.org/agd5f/linux/-/blob/amd-staging-drm-next/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c?ref_type=heads#L3920.
> We'd need to confirm that the same works for MI200 A+A GPUs as well.
> > > > These drivers use pci_enable_atomic_ops_to_root():
> > > >
> > > >   amdgpu
> > > >   bnxt_re (infiniband)
> > > >   mlx5 (infiniband)
> > > >   qedr (infiniband)
> > > >   mlx5 (ethernet)
> > > >
> > > > Maybe we should assume that because RCiEPs are directly integrated
> > > > into the RC, the RCiEP would only allow AtomicOp Requester Enable to
> > > > be set if the RC supports atomic ops?
> > > >
> > > > I don't like making assumptions like that, but it'd be worse to break
> > > > these devices.
> > > >
> > > > [1] https://lore.kernel.org/all/20260326164002.GA1325368@bhelgaas
> > > >
> > > > > Signed-off-by: Gerd Bayer
> > > > > ---
> > > > >  drivers/pci/pci.c | 5 ++---
> > > > >  1 file changed, 2 insertions(+), 3 deletions(-)
> > > > >
> > > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > > > index 8479c2e1f74f1044416281aba11bf071ea89488a..135e5b591df405e87e7f520a618d7e2ccba55ce1 100644
> > > > > --- a/drivers/pci/pci.c
> > > > > +++ b/drivers/pci/pci.c
> > > > > @@ -3692,15 +3692,14 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
> > > > >  	/*
> > > > >  	 * Per PCIe r4.0, sec 6.15, endpoints and root ports may be
> > > > > -	 * AtomicOp requesters. For now, we only support endpoints as
> > > > > -	 * requesters and root ports as completers. No endpoints as
> > > > > +	 * AtomicOp requesters. For now, we only support (legacy) endpoints
> > > > > +	 * as requesters and root ports as completers. No endpoints as
> > > > >  	 * completers, and no peer-to-peer.
> > > > >  	 */
> > > > >  	switch (pci_pcie_type(dev)) {
> > > > >  	case PCI_EXP_TYPE_ENDPOINT:
> > > > >  	case PCI_EXP_TYPE_LEG_END:
> > > > > -	case PCI_EXP_TYPE_RC_END:
> > > > >  		break;
> > > > >  	default:
> > > > >  		return -EINVAL;
> > > > >
> > > > > --
> > > > > 2.51.0
> > > > >
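
For anyone following along, here is a rough user-space sketch of the
type check that the quoted hunk changes. It is not the kernel code: the
sketch_* function names and the allow_rciep flag are made up for
illustration, and it only models the switch statement, not the walk up
to the Root Port. The constants are assumed to match
include/uapi/linux/pci_regs.h, where the Device/Port Type lives in bits
7:4 of the PCI Express Capabilities register.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror include/uapi/linux/pci_regs.h */
#define PCI_EXP_FLAGS_TYPE	0x00f0	/* Device/Port Type field */
#define PCI_EXP_TYPE_ENDPOINT	0x0	/* Express Endpoint */
#define PCI_EXP_TYPE_LEG_END	0x1	/* Legacy Endpoint */
#define PCI_EXP_TYPE_ROOT_PORT	0x4	/* Root Port */
#define PCI_EXP_TYPE_RC_END	0x9	/* Root Complex Integrated Endpoint */

/* Hypothetical helper: extract the Device/Port Type from the flags word. */
unsigned int sketch_pcie_type(uint16_t pcie_flags)
{
	return (pcie_flags & PCI_EXP_FLAGS_TYPE) >> 4;
}

/*
 * Hypothetical model of the requester type check.
 * allow_rciep == true  models the behavior before the patch,
 * allow_rciep == false models the behavior after it.
 * Returns 0 if the device type may be an AtomicOp requester,
 * -EINVAL otherwise.
 */
int sketch_requester_type_ok(uint16_t pcie_flags, bool allow_rciep)
{
	switch (sketch_pcie_type(pcie_flags)) {
	case PCI_EXP_TYPE_ENDPOINT:
	case PCI_EXP_TYPE_LEG_END:
		return 0;
	case PCI_EXP_TYPE_RC_END:
		return allow_rciep ? 0 : -EINVAL;
	default:
		return -EINVAL;
	}
}
```

With a flags word of 0x0092 (type field 0x9, an RCiEP),
sketch_requester_type_ok() accepts the device when allow_rciep is true
and rejects it with -EINVAL when it is false, which is the regression
risk for RCiEP GPUs and NICs discussed above.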