* Q: Usage of pci_enable_atomic_ops_to_root()
From: Gerd Bayer @ 2025-10-24 17:18 UTC
To: Bjorn Helgaas, Jay Cornwall, Felix Kuehling
Cc: Niklas Schnelle, Alexander Schmidt, netdev, linux-rdma, linux-pci
Hi all,
I stumbled over mlx5's usage of pci_enable_atomic_ops_to_root() at
https://elixir.bootlin.com/linux/v6.18-rc2/source/drivers/net/ethernet/mellanox/mlx5/core/main.c#L937
and was wondering if its repeated calls with the 3 available sizes gave
it the intended result.
I assume the intent was to enable requesting AtomicOps only if all
three sizes (32/64/128-bit) were supported at the root complex. However,
pci_enable_atomic_ops_to_root() would enable the request at the PCIe
level even if only 32-bit sized Ops were supported at the root complex.
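The difference can be illustrated with a small user-space sketch in C. It is only a model, not kernel code: struct model_dev, model_enable_atomic_ops_to_root(), enable_chained() and enable_all_sizes() are hypothetical names, the model ignores the bridge walk that the real function performs, and the chained-call shape is an assumption about how a driver might issue one call per size. Only the register bit values are taken from include/uapi/linux/pci_regs.h.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Bit values as defined in include/uapi/linux/pci_regs.h */
#define PCI_EXP_DEVCAP2_ATOMIC_COMP32  0x00000080u
#define PCI_EXP_DEVCAP2_ATOMIC_COMP64  0x00000100u
#define PCI_EXP_DEVCAP2_ATOMIC_COMP128 0x00000200u
#define PCI_EXP_DEVCTL2_ATOMIC_REQ     0x0040u

/* Hypothetical stand-in for an endpoint and the root port above it. */
struct model_dev {
	uint32_t root_devcap2; /* completer capabilities of the root port */
	uint16_t devctl2;      /* endpoint's Device Control 2 register */
};

/*
 * Simplified model (assumption): succeed and set the AtomicOp Requester
 * Enable bit only if the root port supports every size in cap_mask.
 */
static int model_enable_atomic_ops_to_root(struct model_dev *dev,
					    uint32_t cap_mask)
{
	assert(dev);

	if ((dev->root_devcap2 & cap_mask) != cap_mask)
		return -EINVAL;

	dev->devctl2 |= PCI_EXP_DEVCTL2_ATOMIC_REQ;
	return 0;
}

/*
 * One call per size, chained with &&. A return of 0 (success) ends the
 * chain, so the request bit is set as soon as ANY size is supported.
 */
static int enable_chained(struct model_dev *dev)
{
	if (model_enable_atomic_ops_to_root(dev, PCI_EXP_DEVCAP2_ATOMIC_COMP32) &&
	    model_enable_atomic_ops_to_root(dev, PCI_EXP_DEVCAP2_ATOMIC_COMP64) &&
	    model_enable_atomic_ops_to_root(dev, PCI_EXP_DEVCAP2_ATOMIC_COMP128))
		return -EINVAL;
	return 0;
}

/* One call with a combined mask: succeeds only if ALL sizes are supported. */
static int enable_all_sizes(struct model_dev *dev)
{
	return model_enable_atomic_ops_to_root(dev,
			PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
			PCI_EXP_DEVCAP2_ATOMIC_COMP64 |
			PCI_EXP_DEVCAP2_ATOMIC_COMP128);
}
```

With a modelled root port that only advertises PCI_EXP_DEVCAP2_ATOMIC_COMP32, enable_chained() returns 0 and leaves PCI_EXP_DEVCTL2_ATOMIC_REQ set, while enable_all_sizes() returns -EINVAL and leaves the register untouched. Which of the two a given driver actually wants depends on the sizes its hardware will issue.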
So I checked other users in the kernel and found an inconclusive
picture:
The AMD GPU driver that this was originally introduced for [0] checks
for a combination of two sizes, while a few InfiniBand/Ethernet drivers
and the vfio-pci driver do variations of sequential checks (potentially
enabling requests that they don't want to).
Now the PCIe Spec Rev. 7.0 is also a mixed bag. Section 6.15.3.1
mandates for Root Ports:
> If a Root Port implements any AtomicOp Completer capability for host
> memory access, it must implement all 32-bit and 64-bit AtomicOp
> Completer capabilities. Implementing 128-bit CAS Completer capability
> is optional.
While this is specific in marking only the 128-bit CAS Completer
capability as optional, the Capability bits just specify 128-bit
AtomicOps (all AtomicOps: FetchAdd, Swap, CAS). Strictly interpreted,
this would require root port implementors to announce all-or-nothing of
32/64/128-bit AtomicOps, which makes the size granularity of the
capability bits rather useless and leaves the endpoint device (and its
driver) in the dark when attempting to use 128-bit CAS.
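As a sketch of what the quoted rule implies for the three completer bits, the following C helper checks a Device Capabilities 2 value against it. The function name root_port_caps_follow_quoted_rule() is hypothetical and the check encodes only the sentence quoted above (any completer capability implies both 32-bit and 64-bit; 128-bit is optional). It does not settle how the 128-bit bit should be interpreted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values as defined in include/uapi/linux/pci_regs.h */
#define PCI_EXP_DEVCAP2_ATOMIC_COMP32  0x00000080u
#define PCI_EXP_DEVCAP2_ATOMIC_COMP64  0x00000100u
#define PCI_EXP_DEVCAP2_ATOMIC_COMP128 0x00000200u

/* The three completer bits are expected to occupy bits 7..9. */
static_assert((PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
	       PCI_EXP_DEVCAP2_ATOMIC_COMP64 |
	       PCI_EXP_DEVCAP2_ATOMIC_COMP128) == 0x380u,
	      "completer bits must be distinct and contiguous");

/*
 * Hypothetical helper: returns true if a root port's DevCap2 value is
 * consistent with the quoted rule, i.e. it advertises either no AtomicOp
 * completer capability at all, or at least both 32-bit and 64-bit.
 */
static bool root_port_caps_follow_quoted_rule(uint32_t devcap2)
{
	const uint32_t base = PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
			      PCI_EXP_DEVCAP2_ATOMIC_COMP64;
	const uint32_t any = base | PCI_EXP_DEVCAP2_ATOMIC_COMP128;

	if (!(devcap2 & any))
		return true; /* no completer capability advertised */

	return (devcap2 & base) == base;
}
```

Under this reading, 32+64 and 32+64+128 are both acceptable combinations, whereas a root port advertising only 32-bit, or only 128-bit, would not match the quoted sentence.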
[0]: https://lore.kernel.org/linux-pci/1515113100-4718-1-git-send-email-Felix.Kuehling@amd.com/
Can anybody shed some light on this?
Thank you,
Gerd
* Re: Q: Usage of pci_enable_atomic_ops_to_root()
From: Gerd Bayer @ 2025-11-03 12:16 UTC
To: Bjorn Helgaas, Jay Cornwall, Felix Kuehling
Cc: Niklas Schnelle, Alexander Schmidt, netdev, linux-rdma, linux-pci,
Gerd Bayer
On Fri, 2025-10-24 at 19:18 +0200, Gerd Bayer wrote:
> Hi all,
>
> I stumbled over mlx5's usage of pci_enable_atomic_ops_to_root() at
>
> https://elixir.bootlin.com/linux/v6.18-rc2/source/drivers/net/ethernet/mellanox/mlx5/core/main.c#L937
>
> and was wondering if its repeated calls with the 3 available sizes gave
> it the intended result.
>
> I assume the intent was to enable requesting AtomicOps only if all
> three sizes 32/64/128-bit were supported at the root-complex. However,
> pci_enable_atomic_ops_to_root() would enable the request at the PCIe
> level, even if just 32-bit sized Ops was supported at the root-complex.
Looks like I might just send out an RFC patch for review by the
Mellanox/NVidia folks. Not sure if I can find a test setup for this,
though...
> So I checked other users in the kernel and found an inconclusive
> picture:
> The AMD GPU that this was originally introduced for [0] checks for a
> combination of two sizes, while a few infiniband/ethernet and the vfio-
> pci driver do variations of sequential checks (potentially enabling
> requests that they don't want to)
>
> Now the PCIe Spec Rev. 7.0 has also a mixed bag. Section 6.15.3.1
> mandates for Root Ports:
>
> > If a Root Port implements any AtomicOp Completer capability for host
> > memory access, it must implement all 32-bit and 64-bit AtomicOp
> > Completer capabilities. Implementing 128-bit CAS Completer capability
> > is optional.
>
> While this is specific, marking the CAS Op Completions in the 128-bit
> variant optional, the Capability bits just specify 128-bit AtomicOps
> (all AtomicOps: FetchAdd, Swap, CAS). Strictly interpreted, this would
> require root port implementors to announce all-or-nothing of 32/64/128-
> bit AtomicOps - which kind of makes the size-granularity of the
> capability bits useless - and leave the endpoint device (and its
> driver) attempting to use 128-bit CAS in the dark...
I guess I need to ask the folks at PCI SIG?
> [0]: https://lore.kernel.org/linux-pci/1515113100-4718-1-git-send-email-Felix.Kuehling@amd.com/
>
> Can anybody shed some light on this?
> Thank you,
> Gerd