* [PATCH v7 1/3] PCI: AtomicOps: Do not enable requests by RCiEPs
2026-03-30 13:09 [PATCH v7 0/3] PCI: AtomicOps: Fix pci_enable_atomic_ops_to_root() Gerd Bayer
@ 2026-03-30 13:09 ` Gerd Bayer
2026-03-30 21:42 ` Bjorn Helgaas
2026-03-30 13:09 ` [PATCH v7 2/3] PCI: AtomicOps: Do not enable without support in root port Gerd Bayer
` (2 subsequent siblings)
3 siblings, 1 reply; 14+ messages in thread
From: Gerd Bayer @ 2026-03-30 13:09 UTC (permalink / raw)
To: Bjorn Helgaas, Jay Cornwall, Felix Kuehling, Ilpo Järvinen,
Christian Borntraeger, Niklas Schnelle
Cc: Gerald Schaefer, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
Sven Schnelle, Leon Romanovsky, Alexander Schmidt, linux-s390,
linux-pci, linux-kernel, netdev, linux-rdma, Gerd Bayer
Since root complex integrated end points (RCiEPs) attach to a bus that
has no bridge device describing the root port, the capability to
complete AtomicOps requests cannot be determined with PCIe methods.
Change default of pci_enable_atomic_ops_to_root() to not enable
AtomicOps requests on RCiEPs.
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
---
drivers/pci/pci.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 8479c2e1f74f1044416281aba11bf071ea89488a..135e5b591df405e87e7f520a618d7e2ccba55ce1 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3692,15 +3692,14 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
/*
* Per PCIe r4.0, sec 6.15, endpoints and root ports may be
- * AtomicOp requesters. For now, we only support endpoints as
- * requesters and root ports as completers. No endpoints as
+ * AtomicOp requesters. For now, we only support (legacy) endpoints
+ * as requesters and root ports as completers. No endpoints as
* completers, and no peer-to-peer.
*/
switch (pci_pcie_type(dev)) {
case PCI_EXP_TYPE_ENDPOINT:
case PCI_EXP_TYPE_LEG_END:
- case PCI_EXP_TYPE_RC_END:
break;
default:
return -EINVAL;
--
2.51.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v7 1/3] PCI: AtomicOps: Do not enable requests by RCiEPs
2026-03-30 13:09 ` [PATCH v7 1/3] PCI: AtomicOps: Do not enable requests by RCiEPs Gerd Bayer
@ 2026-03-30 21:42 ` Bjorn Helgaas
2026-03-30 22:26 ` Jason Gunthorpe
2026-03-31 0:01 ` Kuehling, Felix
0 siblings, 2 replies; 14+ messages in thread
From: Bjorn Helgaas @ 2026-03-30 21:42 UTC (permalink / raw)
To: Gerd Bayer, Alex Deucher, Christian König, Selvin Xavier,
Kalesh AP, Jason Gunthorpe, Leon Romanovsky, Michal Kalderon,
Saeed Mahameed, Tariq Toukan, Mark Bloch
Cc: Bjorn Helgaas, Jay Cornwall, Felix Kuehling, Ilpo Järvinen,
Christian Borntraeger, Niklas Schnelle, Gerald Schaefer,
Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Sven Schnelle,
Alexander Schmidt, linux-s390, linux-pci, linux-kernel, netdev,
linux-rdma
[+to amdgpu, bnxt_re, mlx5 IB, qedr, mlx5 maintainers]
On Mon, Mar 30, 2026 at 03:09:44PM +0200, Gerd Bayer wrote:
> Since root complex integrated end points (RCiEPs) attach to a bus that
> has no bridge device describing the root port, the capability to
> complete AtomicOps requests cannot be determined with PCIe methods.
>
> Change default of pci_enable_atomic_ops_to_root() to not enable
> AtomicOps requests on RCiEPs.
I know I suggested this because there's nothing explicit that tells us
whether the RC supports atomic ops from RCiEPs [1]. But I'm concerned
that GPUs, infiniband HCAs, and NICs that use atomic ops may be
implemented as RCiEPs and would be broken by this.
These drivers use pci_enable_atomic_ops_to_root():
amdgpu
bnxt_re (infiniband)
mlx5 (infiniband)
qedr (infiniband)
mlx5 (ethernet)
Maybe we should assume that because RCiEPs are directly integrated
into the RC, the RCiEP would only allow AtomicOp Requester Enable to
be set if the RC supports atomic ops?
I don't like making assumptions like that, but it'd be worse to break
these devices.
[1] https://lore.kernel.org/all/20260326164002.GA1325368@bhelgaas
> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> ---
> drivers/pci/pci.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 8479c2e1f74f1044416281aba11bf071ea89488a..135e5b591df405e87e7f520a618d7e2ccba55ce1 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3692,15 +3692,14 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
>
> /*
> * Per PCIe r4.0, sec 6.15, endpoints and root ports may be
> - * AtomicOp requesters. For now, we only support endpoints as
> - * requesters and root ports as completers. No endpoints as
> + * AtomicOp requesters. For now, we only support (legacy) endpoints
> + * as requesters and root ports as completers. No endpoints as
> * completers, and no peer-to-peer.
> */
>
> switch (pci_pcie_type(dev)) {
> case PCI_EXP_TYPE_ENDPOINT:
> case PCI_EXP_TYPE_LEG_END:
> - case PCI_EXP_TYPE_RC_END:
> break;
> default:
> return -EINVAL;
>
> --
> 2.51.0
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v7 1/3] PCI: AtomicOps: Do not enable requests by RCiEPs
2026-03-30 21:42 ` Bjorn Helgaas
@ 2026-03-30 22:26 ` Jason Gunthorpe
2026-03-31 0:01 ` Kuehling, Felix
1 sibling, 0 replies; 14+ messages in thread
From: Jason Gunthorpe @ 2026-03-30 22:26 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Gerd Bayer, Alex Deucher, Christian König, Selvin Xavier,
Kalesh AP, Leon Romanovsky, Michal Kalderon, Saeed Mahameed,
Tariq Toukan, Mark Bloch, Bjorn Helgaas, Jay Cornwall,
Felix Kuehling, Ilpo Järvinen, Christian Borntraeger,
Niklas Schnelle, Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Sven Schnelle, Alexander Schmidt, linux-s390,
linux-pci, linux-kernel, netdev, linux-rdma
On Mon, Mar 30, 2026 at 04:42:53PM -0500, Bjorn Helgaas wrote:
> [+to amdgpu, bnxt_re, mlx5 IB, qedr, mlx5 maintainers]
>
> On Mon, Mar 30, 2026 at 03:09:44PM +0200, Gerd Bayer wrote:
> > Since root complex integrated end points (RCiEPs) attach to a bus that
> > has no bridge device describing the root port, the capability to
> > complete AtomicOps requests cannot be determined with PCIe methods.
> >
> > Change default of pci_enable_atomic_ops_to_root() to not enable
> > AtomicOps requests on RCiEPs.
>
> I know I suggested this because there's nothing explicit that tells us
> whether the RC supports atomic ops from RCiEPs [1]. But I'm concerned
> that GPUs, infiniband HCAs, and NICs that use atomic ops may be
> implemented as RCiEPs and would be broken by this.
AFAIK none of the NICs are integrated into root complexes in their
topology model.
Jason
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v7 1/3] PCI: AtomicOps: Do not enable requests by RCiEPs
2026-03-30 21:42 ` Bjorn Helgaas
2026-03-30 22:26 ` Jason Gunthorpe
@ 2026-03-31 0:01 ` Kuehling, Felix
2026-03-31 18:09 ` Bjorn Helgaas
1 sibling, 1 reply; 14+ messages in thread
From: Kuehling, Felix @ 2026-03-31 0:01 UTC (permalink / raw)
To: Bjorn Helgaas, Gerd Bayer, Alex Deucher, Christian König,
Selvin Xavier, Kalesh AP, Jason Gunthorpe, Leon Romanovsky,
Michal Kalderon, Saeed Mahameed, Tariq Toukan, Mark Bloch
Cc: Bjorn Helgaas, Jay Cornwall, Ilpo Järvinen,
Christian Borntraeger, Niklas Schnelle, Gerald Schaefer,
Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Sven Schnelle,
Alexander Schmidt, linux-s390, linux-pci, linux-kernel, netdev,
linux-rdma
On 2026-03-30 17:42, Bjorn Helgaas wrote:
> [+to amdgpu, bnxt_re, mlx5 IB, qedr, mlx5 maintainers]
>
> On Mon, Mar 30, 2026 at 03:09:44PM +0200, Gerd Bayer wrote:
>> Since root complex integrated end points (RCiEPs) attach to a bus that
>> has no bridge device describing the root port, the capability to
>> complete AtomicOps requests cannot be determined with PCIe methods.
>>
>> Change default of pci_enable_atomic_ops_to_root() to not enable
>> AtomicOps requests on RCiEPs.
> I know I suggested this because there's nothing explicit that tells us
> whether the RC supports atomic ops from RCiEPs [1]. But I'm concerned
> that GPUs, infiniband HCAs, and NICs that use atomic ops may be
> implemented as RCiEPs and would be broken by this.
FWIW, on AMD APUs our driver doesn't call pci_enable_atomic_ops_to_root.
It just assumes that the GPU can do atomic accesses because it doesn't
actually go through PCIe:
https://elixir.bootlin.com/linux/v6.19.10/source/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c#L4785
Regards,
Felix
>
> These drivers use pci_enable_atomic_ops_to_root():
>
> amdgpu
> bnxt_re (infiniband)
> mlx5 (infiniband)
> qedr (infiniband)
> mlx5 (ethernet)
>
> Maybe we should assume that because RCiEPs are directly integrated
> into the RC, the RCiEP would only allow AtomicOp Requester Enable to
> be set if the RC supports atomic ops?
>
> I don't like making assumptions like that, but it'd be worse to break
> these devices.
>
> [1] https://lore.kernel.org/all/20260326164002.GA1325368@bhelgaas
>
>> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
>> ---
>> drivers/pci/pci.c | 5 ++---
>> 1 file changed, 2 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 8479c2e1f74f1044416281aba11bf071ea89488a..135e5b591df405e87e7f520a618d7e2ccba55ce1 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -3692,15 +3692,14 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
>>
>> /*
>> * Per PCIe r4.0, sec 6.15, endpoints and root ports may be
>> - * AtomicOp requesters. For now, we only support endpoints as
>> - * requesters and root ports as completers. No endpoints as
>> + * AtomicOp requesters. For now, we only support (legacy) endpoints
>> + * as requesters and root ports as completers. No endpoints as
>> * completers, and no peer-to-peer.
>> */
>>
>> switch (pci_pcie_type(dev)) {
>> case PCI_EXP_TYPE_ENDPOINT:
>> case PCI_EXP_TYPE_LEG_END:
>> - case PCI_EXP_TYPE_RC_END:
>> break;
>> default:
>> return -EINVAL;
>>
>> --
>> 2.51.0
>>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v7 1/3] PCI: AtomicOps: Do not enable requests by RCiEPs
2026-03-31 0:01 ` Kuehling, Felix
@ 2026-03-31 18:09 ` Bjorn Helgaas
2026-03-31 18:39 ` Kuehling, Felix
0 siblings, 1 reply; 14+ messages in thread
From: Bjorn Helgaas @ 2026-03-31 18:09 UTC (permalink / raw)
To: Kuehling, Felix
Cc: Gerd Bayer, Alex Deucher, Christian König, Selvin Xavier,
Kalesh AP, Jason Gunthorpe, Leon Romanovsky, Michal Kalderon,
Saeed Mahameed, Tariq Toukan, Mark Bloch, Bjorn Helgaas,
Jay Cornwall, Ilpo Järvinen, Christian Borntraeger,
Niklas Schnelle, Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Sven Schnelle, Alexander Schmidt, linux-s390,
linux-pci, linux-kernel, netdev, linux-rdma
On Mon, Mar 30, 2026 at 08:01:57PM -0400, Kuehling, Felix wrote:
> On 2026-03-30 17:42, Bjorn Helgaas wrote:
> > [+to amdgpu, bnxt_re, mlx5 IB, qedr, mlx5 maintainers]
> >
> > On Mon, Mar 30, 2026 at 03:09:44PM +0200, Gerd Bayer wrote:
> > > Since root complex integrated end points (RCiEPs) attach to a bus that
> > > has no bridge device describing the root port, the capability to
> > > complete AtomicOps requests cannot be determined with PCIe methods.
> > >
> > > Change default of pci_enable_atomic_ops_to_root() to not enable
> > > AtomicOps requests on RCiEPs.
> > I know I suggested this because there's nothing explicit that tells us
> > whether the RC supports atomic ops from RCiEPs [1]. But I'm concerned
> > that GPUs, infiniband HCAs, and NICs that use atomic ops may be
> > implemented as RCiEPs and would be broken by this.
>
> FWIW, on AMD APUs our driver doesn't call pci_enable_atomic_ops_to_root. It
> just assumes that the GPU can do atomic accesses because it doesn't actually
> go through PCIe: https://elixir.bootlin.com/linux/v6.19.10/source/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c#L4785
What does this mean for the other branch that *does* use
pci_enable_atomic_ops_to_root()? Can any of those devices be RCiEPs?
> > These drivers use pci_enable_atomic_ops_to_root():
> >
> > amdgpu
> > bnxt_re (infiniband)
> > mlx5 (infiniband)
> > qedr (infiniband)
> > mlx5 (ethernet)
> >
> > Maybe we should assume that because RCiEPs are directly integrated
> > into the RC, the RCiEP would only allow AtomicOp Requester Enable to
> > be set if the RC supports atomic ops?
> >
> > I don't like making assumptions like that, but it'd be worse to break
> > these devices.
> >
> > [1] https://lore.kernel.org/all/20260326164002.GA1325368@bhelgaas
> >
> > > Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> > > ---
> > > drivers/pci/pci.c | 5 ++---
> > > 1 file changed, 2 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > index 8479c2e1f74f1044416281aba11bf071ea89488a..135e5b591df405e87e7f520a618d7e2ccba55ce1 100644
> > > --- a/drivers/pci/pci.c
> > > +++ b/drivers/pci/pci.c
> > > @@ -3692,15 +3692,14 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
> > > /*
> > > * Per PCIe r4.0, sec 6.15, endpoints and root ports may be
> > > - * AtomicOp requesters. For now, we only support endpoints as
> > > - * requesters and root ports as completers. No endpoints as
> > > + * AtomicOp requesters. For now, we only support (legacy) endpoints
> > > + * as requesters and root ports as completers. No endpoints as
> > > * completers, and no peer-to-peer.
> > > */
> > > switch (pci_pcie_type(dev)) {
> > > case PCI_EXP_TYPE_ENDPOINT:
> > > case PCI_EXP_TYPE_LEG_END:
> > > - case PCI_EXP_TYPE_RC_END:
> > > break;
> > > default:
> > > return -EINVAL;
> > >
> > > --
> > > 2.51.0
> > >
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v7 1/3] PCI: AtomicOps: Do not enable requests by RCiEPs
2026-03-31 18:09 ` Bjorn Helgaas
@ 2026-03-31 18:39 ` Kuehling, Felix
2026-03-31 19:01 ` Bjorn Helgaas
0 siblings, 1 reply; 14+ messages in thread
From: Kuehling, Felix @ 2026-03-31 18:39 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Gerd Bayer, Alex Deucher, Christian König, Selvin Xavier,
Kalesh AP, Jason Gunthorpe, Leon Romanovsky, Michal Kalderon,
Saeed Mahameed, Tariq Toukan, Mark Bloch, Bjorn Helgaas,
Jay Cornwall, Ilpo Järvinen, Christian Borntraeger,
Niklas Schnelle, Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Sven Schnelle, Alexander Schmidt, linux-s390,
linux-pci, linux-kernel, netdev, linux-rdma
On 2026-03-31 14:09, Bjorn Helgaas wrote:
> On Mon, Mar 30, 2026 at 08:01:57PM -0400, Kuehling, Felix wrote:
>> On 2026-03-30 17:42, Bjorn Helgaas wrote:
>>> [+to amdgpu, bnxt_re, mlx5 IB, qedr, mlx5 maintainers]
>>>
>>> On Mon, Mar 30, 2026 at 03:09:44PM +0200, Gerd Bayer wrote:
>>>> Since root complex integrated end points (RCiEPs) attach to a bus that
>>>> has no bridge device describing the root port, the capability to
>>>> complete AtomicOps requests cannot be determined with PCIe methods.
>>>>
>>>> Change default of pci_enable_atomic_ops_to_root() to not enable
>>>> AtomicOps requests on RCiEPs.
>>> I know I suggested this because there's nothing explicit that tells us
>>> whether the RC supports atomic ops from RCiEPs [1]. But I'm concerned
>>> that GPUs, infiniband HCAs, and NICs that use atomic ops may be
>>> implemented as RCiEPs and would be broken by this.
>> FWIW, on AMD APUs our driver doesn't call pci_enable_atomic_ops_to_root. It
>> just assumes that the GPU can do atomic accesses because it doesn't actually
>> go through PCIe: https://elixir.bootlin.com/linux/v6.19.10/source/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c#L4785
> What does this mean for the other branch that *does* use
> pci_enable_atomic_ops_to_root()? Can any of those devices be RCiEPs?
Most AMD GPUs are not integrated endpoints. APUs are integrated. There
are A+A GPUs where the GPUs are separate from the CPU but part of the
same coherent data fabric as the CPU (adev->gmc.xgmi.connected_to_cpu ==
true). Those may also be considered RCiEPs. (I'm not sure about that, is
there an easy way to check with lspci?) We may need to include that in
the same branch as APUs.
You can see that we did that for a new generation of A+A GPU here:
https://gitlab.freedesktop.org/agd5f/linux/-/blob/amd-staging-drm-next/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c?ref_type=heads#L3920.
We'd need to confirm that the same works for MI200 A+A GPUs as well.
Regards,
Felix
>
>>> These drivers use pci_enable_atomic_ops_to_root():
>>>
>>> amdgpu
>>> bnxt_re (infiniband)
>>> mlx5 (infiniband)
>>> qedr (infiniband)
>>> mlx5 (ethernet)
>>>
>>> Maybe we should assume that because RCiEPs are directly integrated
>>> into the RC, the RCiEP would only allow AtomicOp Requester Enable to
>>> be set if the RC supports atomic ops?
>>>
>>> I don't like making assumptions like that, but it'd be worse to break
>>> these devices.
>>>
>>> [1] https://lore.kernel.org/all/20260326164002.GA1325368@bhelgaas
>>>
>>>> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
>>>> ---
>>>> drivers/pci/pci.c | 5 ++---
>>>> 1 file changed, 2 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>>>> index 8479c2e1f74f1044416281aba11bf071ea89488a..135e5b591df405e87e7f520a618d7e2ccba55ce1 100644
>>>> --- a/drivers/pci/pci.c
>>>> +++ b/drivers/pci/pci.c
>>>> @@ -3692,15 +3692,14 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
>>>> /*
>>>> * Per PCIe r4.0, sec 6.15, endpoints and root ports may be
>>>> - * AtomicOp requesters. For now, we only support endpoints as
>>>> - * requesters and root ports as completers. No endpoints as
>>>> + * AtomicOp requesters. For now, we only support (legacy) endpoints
>>>> + * as requesters and root ports as completers. No endpoints as
>>>> * completers, and no peer-to-peer.
>>>> */
>>>> switch (pci_pcie_type(dev)) {
>>>> case PCI_EXP_TYPE_ENDPOINT:
>>>> case PCI_EXP_TYPE_LEG_END:
>>>> - case PCI_EXP_TYPE_RC_END:
>>>> break;
>>>> default:
>>>> return -EINVAL;
>>>>
>>>> --
>>>> 2.51.0
>>>>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v7 1/3] PCI: AtomicOps: Do not enable requests by RCiEPs
2026-03-31 18:39 ` Kuehling, Felix
@ 2026-03-31 19:01 ` Bjorn Helgaas
2026-03-31 20:12 ` Kuehling, Felix
0 siblings, 1 reply; 14+ messages in thread
From: Bjorn Helgaas @ 2026-03-31 19:01 UTC (permalink / raw)
To: Kuehling, Felix
Cc: Gerd Bayer, Alex Deucher, Christian König, Selvin Xavier,
Kalesh AP, Jason Gunthorpe, Leon Romanovsky, Michal Kalderon,
Saeed Mahameed, Tariq Toukan, Mark Bloch, Bjorn Helgaas,
Jay Cornwall, Ilpo Järvinen, Christian Borntraeger,
Niklas Schnelle, Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Sven Schnelle, Alexander Schmidt, linux-s390,
linux-pci, linux-kernel, netdev, linux-rdma
On Tue, Mar 31, 2026 at 02:39:26PM -0400, Kuehling, Felix wrote:
> On 2026-03-31 14:09, Bjorn Helgaas wrote:
> > On Mon, Mar 30, 2026 at 08:01:57PM -0400, Kuehling, Felix wrote:
> > > On 2026-03-30 17:42, Bjorn Helgaas wrote:
> > > > [+to amdgpu, bnxt_re, mlx5 IB, qedr, mlx5 maintainers]
> > > >
> > > > On Mon, Mar 30, 2026 at 03:09:44PM +0200, Gerd Bayer wrote:
> > > > > Since root complex integrated end points (RCiEPs) attach to a bus that
> > > > > has no bridge device describing the root port, the capability to
> > > > > complete AtomicOps requests cannot be determined with PCIe methods.
> > > > >
> > > > > Change default of pci_enable_atomic_ops_to_root() to not enable
> > > > > AtomicOps requests on RCiEPs.
> > > > I know I suggested this because there's nothing explicit that tells us
> > > > whether the RC supports atomic ops from RCiEPs [1]. But I'm concerned
> > > > that GPUs, infiniband HCAs, and NICs that use atomic ops may be
> > > > implemented as RCiEPs and would be broken by this.
> > > FWIW, on AMD APUs our driver doesn't call pci_enable_atomic_ops_to_root. It
> > > just assumes that the GPU can do atomic accesses because it doesn't actually
> > > go through PCIe: https://elixir.bootlin.com/linux/v6.19.10/source/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c#L4785
> > What does this mean for the other branch that *does* use
> > pci_enable_atomic_ops_to_root()? Can any of those devices be RCiEPs?
>
> Most AMD GPUs are not integrated endpoints. APUs are integrated. There are
> A+A GPUs where the GPUs are separate from the CPU but part of the same
> coherent data fabric as the CPU (adev->gmc.xgmi.connected_to_cpu == true).
> Those may also be considered RCiEPs. (I'm not sure about that, is there an
> easy way to check with lspci?) We may need to include that in the same
> branch as APUs.
Yep, for RCiEPs, "lspci -v" should say something like this:
Capabilities: [64] Express Root Complex Integrated Endpoint
Dmesg logs from recent kernels would also include it like this:
pci 0000:00:02.0: [8086:5916] type 00 class 0x030000 PCIe Root Complex Integrated Endpoint
An RCiEP would be on the root bus; it would not be below a Root Port.
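Both strings should derive from the Device/Port Type field (bits 7:4)
of the PCI Express Capabilities register, which is also what
pci_pcie_type() reports. A minimal user-space sketch of that decoding,
assuming the register layout in include/uapi/linux/pci_regs.h; the
helper names pcie_port_type() and is_rciep() are made up for
illustration and are not kernel API:

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/pci_regs.h */
#define PCI_EXP_FLAGS_TYPE	0x00f0	/* Device/Port type field */
#define PCI_EXP_TYPE_ENDPOINT	0x0	/* Express Endpoint */
#define PCI_EXP_TYPE_LEG_END	0x1	/* Legacy Endpoint */
#define PCI_EXP_TYPE_ROOT_PORT	0x4	/* Root Port */
#define PCI_EXP_TYPE_RC_END	0x9	/* Root Complex Integrated Endpoint */

/* Hypothetical helper: extract the port type from the 16-bit register. */
static unsigned int pcie_port_type(uint16_t exp_flags)
{
	return (exp_flags & PCI_EXP_FLAGS_TYPE) >> 4;
}

/* Hypothetical helper: true if the register describes an RCiEP. */
static bool is_rciep(uint16_t exp_flags)
{
	return pcie_port_type(exp_flags) == PCI_EXP_TYPE_RC_END;
}
```

For example, a register value of 0x0092 (capability version 2, type
0x9) decodes as an RCiEP, while 0x0002 decodes as a regular Endpoint.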
> You can see that we did that for a new generation of A+A GPU here: https://gitlab.freedesktop.org/agd5f/linux/-/blob/amd-staging-drm-next/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c?ref_type=heads#L3920.
> We'd need to confirm that the same works for MI200 A+A GPUs as well.
> > > > These drivers use pci_enable_atomic_ops_to_root():
> > > >
> > > > amdgpu
> > > > bnxt_re (infiniband)
> > > > mlx5 (infiniband)
> > > > qedr (infiniband)
> > > > mlx5 (ethernet)
> > > >
> > > > Maybe we should assume that because RCiEPs are directly integrated
> > > > into the RC, the RCiEP would only allow AtomicOp Requester Enable to
> > > > be set if the RC supports atomic ops?
> > > >
> > > > I don't like making assumptions like that, but it'd be worse to break
> > > > these devices.
> > > >
> > > > [1] https://lore.kernel.org/all/20260326164002.GA1325368@bhelgaas
> > > >
> > > > > Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> > > > > ---
> > > > > drivers/pci/pci.c | 5 ++---
> > > > > 1 file changed, 2 insertions(+), 3 deletions(-)
> > > > >
> > > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > > > index 8479c2e1f74f1044416281aba11bf071ea89488a..135e5b591df405e87e7f520a618d7e2ccba55ce1 100644
> > > > > --- a/drivers/pci/pci.c
> > > > > +++ b/drivers/pci/pci.c
> > > > > @@ -3692,15 +3692,14 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
> > > > > /*
> > > > > * Per PCIe r4.0, sec 6.15, endpoints and root ports may be
> > > > > - * AtomicOp requesters. For now, we only support endpoints as
> > > > > - * requesters and root ports as completers. No endpoints as
> > > > > + * AtomicOp requesters. For now, we only support (legacy) endpoints
> > > > > + * as requesters and root ports as completers. No endpoints as
> > > > > * completers, and no peer-to-peer.
> > > > > */
> > > > > switch (pci_pcie_type(dev)) {
> > > > > case PCI_EXP_TYPE_ENDPOINT:
> > > > > case PCI_EXP_TYPE_LEG_END:
> > > > > - case PCI_EXP_TYPE_RC_END:
> > > > > break;
> > > > > default:
> > > > > return -EINVAL;
> > > > >
> > > > > --
> > > > > 2.51.0
> > > > >
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v7 1/3] PCI: AtomicOps: Do not enable requests by RCiEPs
2026-03-31 19:01 ` Bjorn Helgaas
@ 2026-03-31 20:12 ` Kuehling, Felix
0 siblings, 0 replies; 14+ messages in thread
From: Kuehling, Felix @ 2026-03-31 20:12 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Gerd Bayer, Alex Deucher, Christian König, Selvin Xavier,
Kalesh AP, Jason Gunthorpe, Leon Romanovsky, Michal Kalderon,
Saeed Mahameed, Tariq Toukan, Mark Bloch, Bjorn Helgaas,
Jay Cornwall, Ilpo Järvinen, Christian Borntraeger,
Niklas Schnelle, Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Sven Schnelle, Alexander Schmidt, linux-s390,
linux-pci, linux-kernel, netdev, linux-rdma
On 2026-03-31 15:01, Bjorn Helgaas wrote:
> On Tue, Mar 31, 2026 at 02:39:26PM -0400, Kuehling, Felix wrote:
>> On 2026-03-31 14:09, Bjorn Helgaas wrote:
>>> On Mon, Mar 30, 2026 at 08:01:57PM -0400, Kuehling, Felix wrote:
>>>> On 2026-03-30 17:42, Bjorn Helgaas wrote:
>>>>> [+to amdgpu, bnxt_re, mlx5 IB, qedr, mlx5 maintainers]
>>>>>
>>>>> On Mon, Mar 30, 2026 at 03:09:44PM +0200, Gerd Bayer wrote:
>>>>>> Since root complex integrated end points (RCiEPs) attach to a bus that
>>>>>> has no bridge device describing the root port, the capability to
>>>>>> complete AtomicOps requests cannot be determined with PCIe methods.
>>>>>>
>>>>>> Change default of pci_enable_atomic_ops_to_root() to not enable
>>>>>> AtomicOps requests on RCiEPs.
>>>>> I know I suggested this because there's nothing explicit that tells us
>>>>> whether the RC supports atomic ops from RCiEPs [1]. But I'm concerned
>>>>> that GPUs, infiniband HCAs, and NICs that use atomic ops may be
>>>>> implemented as RCiEPs and would be broken by this.
>>>> FWIW, on AMD APUs our driver doesn't call pci_enable_atomic_ops_to_root. It
>>>> just assumes that the GPU can do atomic accesses because it doesn't actually
>>>> go through PCIe: https://elixir.bootlin.com/linux/v6.19.10/source/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c#L4785
>>> What does this mean for the other branch that *does* use
>>> pci_enable_atomic_ops_to_root()? Can any of those devices be RCiEPs?
>> Most AMD GPUs are not integrated endpoints. APUs are integrated. There are
>> A+A GPUs where the GPUs are separate from the CPU but part of the same
>> coherent data fabric as the CPU (adev->gmc.xgmi.connected_to_cpu == true).
>> Those may also be considered RCiEPs. (I'm not sure about that, is there an
>> easy way to check with lspci?) We may need to include that in the same
>> branch as APUs.
> Yep, for RCiEPs, "lspci -v" should say something like this:
>
> Capabilities: [64] Express Root Complex Integrated Endpoint
>
> Dmesg logs from recent kernels would also include it like this:
>
> pci 0000:00:02.0: [8086:5916] type 00 class 0x030000 PCIe Root Complex Integrated Endpoint
>
> An RCiEP would be on the root bus; it would not be below a Root Port.
I'm getting this from lspci:
Capabilities: [64] Express Endpoint, MSI 00
Regards,
Felix
>
>> You can see that we did that for a new generation of A+A GPU here: https://gitlab.freedesktop.org/agd5f/linux/-/blob/amd-staging-drm-next/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c?ref_type=heads#L3920.
>> We'd need to confirm that the same works for MI200 A+A GPUs as well.
>>>>> These drivers use pci_enable_atomic_ops_to_root():
>>>>>
>>>>> amdgpu
>>>>> bnxt_re (infiniband)
>>>>> mlx5 (infiniband)
>>>>> qedr (infiniband)
>>>>> mlx5 (ethernet)
>>>>>
>>>>> Maybe we should assume that because RCiEPs are directly integrated
>>>>> into the RC, the RCiEP would only allow AtomicOp Requester Enable to
>>>>> be set if the RC supports atomic ops?
>>>>>
>>>>> I don't like making assumptions like that, but it'd be worse to break
>>>>> these devices.
>>>>>
>>>>> [1] https://lore.kernel.org/all/20260326164002.GA1325368@bhelgaas
>>>>>
>>>>>> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
>>>>>> ---
>>>>>> drivers/pci/pci.c | 5 ++---
>>>>>> 1 file changed, 2 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>>>>>> index 8479c2e1f74f1044416281aba11bf071ea89488a..135e5b591df405e87e7f520a618d7e2ccba55ce1 100644
>>>>>> --- a/drivers/pci/pci.c
>>>>>> +++ b/drivers/pci/pci.c
>>>>>> @@ -3692,15 +3692,14 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
>>>>>> /*
>>>>>> * Per PCIe r4.0, sec 6.15, endpoints and root ports may be
>>>>>> - * AtomicOp requesters. For now, we only support endpoints as
>>>>>> - * requesters and root ports as completers. No endpoints as
>>>>>> + * AtomicOp requesters. For now, we only support (legacy) endpoints
>>>>>> + * as requesters and root ports as completers. No endpoints as
>>>>>> * completers, and no peer-to-peer.
>>>>>> */
>>>>>> switch (pci_pcie_type(dev)) {
>>>>>> case PCI_EXP_TYPE_ENDPOINT:
>>>>>> case PCI_EXP_TYPE_LEG_END:
>>>>>> - case PCI_EXP_TYPE_RC_END:
>>>>>> break;
>>>>>> default:
>>>>>> return -EINVAL;
>>>>>>
>>>>>> --
>>>>>> 2.51.0
>>>>>>
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v7 2/3] PCI: AtomicOps: Do not enable without support in root port
2026-03-30 13:09 [PATCH v7 0/3] PCI: AtomicOps: Fix pci_enable_atomic_ops_to_root() Gerd Bayer
2026-03-30 13:09 ` [PATCH v7 1/3] PCI: AtomicOps: Do not enable requests by RCiEPs Gerd Bayer
@ 2026-03-30 13:09 ` Gerd Bayer
2026-04-01 17:27 ` Bjorn Helgaas
2026-03-30 13:09 ` [PATCH v7 3/3] PCI: AtomicOps: Update references to PCIe spec Gerd Bayer
2026-04-02 16:38 ` [PATCH v7 0/3] PCI: AtomicOps: Fix pci_enable_atomic_ops_to_root() Bjorn Helgaas
3 siblings, 1 reply; 14+ messages in thread
From: Gerd Bayer @ 2026-03-30 13:09 UTC (permalink / raw)
To: Bjorn Helgaas, Jay Cornwall, Felix Kuehling, Ilpo Järvinen,
Christian Borntraeger, Niklas Schnelle
Cc: Gerald Schaefer, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
Sven Schnelle, Leon Romanovsky, Alexander Schmidt, linux-s390,
linux-pci, linux-kernel, netdev, linux-rdma, Gerd Bayer, stable
When inspecting the config space of a Connect-X physical function in an
s390 system after it was initialized by the mlx5_core device driver, we
found the function to be enabled to request AtomicOps despite the
system's root-complex lacking support for completing them:
1ed0:00:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
Subsystem: Mellanox Technologies Device 0002
[...]
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
AtomicOpsCtl: ReqEn+
IDOReq- IDOCompl- LTR- EmergencyPowerReductionReq-
10BitTagReq- OBFF Disabled, EETLPPrefixBlk-
Turns out the device driver calls pci_enable_atomic_ops_to_root() which
defaulted to enable AtomicOps requests even if it had no information
about the root port that the PCIe device is attached to.
Change the logic of pci_enable_atomic_ops_to_root() to fully traverse the
PCIe tree upwards, check that the bridge devices support delivering
AtomicOps transactions, and finally check that there is a root port at
the end that does support completing AtomicOps.
Reported-by: Alexander Schmidt <alexs@linux.ibm.com>
Cc: stable@vger.kernel.org
Fixes: 430a23689dea ("PCI: Add pci_enable_atomic_ops_to_root()")
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
---
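The traversal described in the commit message can be sketched as a
small user-space model. This is only an illustration under stated
assumptions, not the kernel code: struct model_bridge and
model_atomic_path_ok() are hypothetical stand-ins for struct pci_dev
and the bridge walk, and the register bits are copied from
include/uapi/linux/pci_regs.h.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/pci_regs.h */
#define PCI_EXP_TYPE_ROOT_PORT			0x4
#define PCI_EXP_TYPE_UPSTREAM			0x5
#define PCI_EXP_TYPE_DOWNSTREAM			0x6
#define PCI_EXP_DEVCAP2_ATOMIC_ROUTE		0x00000040
#define PCI_EXP_DEVCAP2_ATOMIC_COMP32		0x00000080
#define PCI_EXP_DEVCAP2_ATOMIC_COMP64		0x00000100
#define PCI_EXP_DEVCAP2_ATOMIC_COMP128		0x00000200
#define PCI_EXP_DEVCTL2_ATOMIC_EGRESS_BLOCK	0x0080

/* Hypothetical stand-in for one bridge on the path to the root. */
struct model_bridge {
	int type;				/* PCI_EXP_TYPE_* */
	uint32_t devcap2;			/* Device Capabilities 2 */
	uint16_t devctl2;			/* Device Control 2 */
	const struct model_bridge *upstream;	/* NULL above the last bridge */
};

/*
 * Walk from the bridge directly above the requester up to the last
 * visible bridge. Returns true only if every switch port can route
 * AtomicOps, no upstream port blocks them on egress, and the last
 * bridge is a root port that completes all sizes in cap_mask.
 */
static bool model_atomic_path_ok(const struct model_bridge *first,
				 uint32_t cap_mask)
{
	const struct model_bridge *bridge = NULL;
	const struct model_bridge *b;

	for (b = first; b; b = b->upstream) {
		bridge = b;
		switch (b->type) {
		case PCI_EXP_TYPE_UPSTREAM:
			/* Upstream ports must not block AtomicOps on egress */
			if (b->devctl2 & PCI_EXP_DEVCTL2_ATOMIC_EGRESS_BLOCK)
				return false;
			/* fall through */
		case PCI_EXP_TYPE_DOWNSTREAM:
			/* All switch ports need to route AtomicOps */
			if (!(b->devcap2 & PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
				return false;
			break;
		}
	}

	/* No bridge at all, or the walk did not end at a root port */
	if (!bridge || bridge->type != PCI_EXP_TYPE_ROOT_PORT)
		return false;

	return (bridge->devcap2 & cap_mask) == cap_mask;
}
```

In this model, a function with no visible bridge above it (first ==
NULL, which is how the s390 case above is assumed to look) is rejected
instead of being enabled by default.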
drivers/pci/pci.c | 39 ++++++++++++++++++++++-----------------
1 file changed, 22 insertions(+), 17 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 135e5b591df405e87e7f520a618d7e2ccba55ce1..57af00ecdc97086a32c063ff86f8a39087ad1f5e 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3660,6 +3660,14 @@ void pci_acs_init(struct pci_dev *dev)
pci_disable_broken_acs_cap(dev);
}
+static bool pci_is_atomicops_capable_rp(struct pci_dev *dev, u32 cap, u32 cap_mask)
+{
+ if (!dev || !(pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT))
+ return false;
+
+ return (cap & cap_mask) == cap_mask;
+}
+
/**
* pci_enable_atomic_ops_to_root - enable AtomicOp requests to root port
* @dev: the PCI device
@@ -3676,8 +3684,9 @@ void pci_acs_init(struct pci_dev *dev)
int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
{
struct pci_bus *bus = dev->bus;
- struct pci_dev *bridge;
- u32 cap, ctl2;
+ struct pci_dev *bridge = NULL;
+ u32 cap = 0;
+ u32 ctl2;
/*
* Per PCIe r5.0, sec 9.3.5.10, the AtomicOp Requester Enable bit
@@ -3713,29 +3722,25 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
switch (pci_pcie_type(bridge)) {
/* Ensure switch ports support AtomicOp routing */
case PCI_EXP_TYPE_UPSTREAM:
- case PCI_EXP_TYPE_DOWNSTREAM:
- if (!(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
- return -EINVAL;
- break;
-
- /* Ensure root port supports all the sizes we care about */
- case PCI_EXP_TYPE_ROOT_PORT:
- if ((cap & cap_mask) != cap_mask)
- return -EINVAL;
- break;
- }
-
- /* Ensure upstream ports don't block AtomicOps on egress */
- if (pci_pcie_type(bridge) == PCI_EXP_TYPE_UPSTREAM) {
+ /* Upstream ports must not block AtomicOps on egress */
pcie_capability_read_dword(bridge, PCI_EXP_DEVCTL2,
&ctl2);
if (ctl2 & PCI_EXP_DEVCTL2_ATOMIC_EGRESS_BLOCK)
return -EINVAL;
+ fallthrough;
+ /* All switch ports need to route AtomicOps */
+ case PCI_EXP_TYPE_DOWNSTREAM:
+ if (!(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
+ return -EINVAL;
+ break;
}
-
bus = bus->parent;
}
+ /* Finally, last bridge must be root port and support requested sizes */
+ if (!(pci_is_atomicops_capable_rp(bridge, cap, cap_mask)))
+ return -EINVAL;
+
pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
PCI_EXP_DEVCTL2_ATOMIC_REQ);
return 0;
--
2.51.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v7 2/3] PCI: AtomicOps: Do not enable without support in root port
2026-03-30 13:09 ` [PATCH v7 2/3] PCI: AtomicOps: Do not enable without support in root port Gerd Bayer
@ 2026-04-01 17:27 ` Bjorn Helgaas
2026-04-02 14:44 ` Gerd Bayer
0 siblings, 1 reply; 14+ messages in thread
From: Bjorn Helgaas @ 2026-04-01 17:27 UTC (permalink / raw)
To: Gerd Bayer
Cc: Bjorn Helgaas, Jay Cornwall, Felix Kuehling, Ilpo Järvinen,
Christian Borntraeger, Niklas Schnelle, Gerald Schaefer,
Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Sven Schnelle,
Leon Romanovsky, Alexander Schmidt, linux-s390, linux-pci,
linux-kernel, netdev, linux-rdma, stable
On Mon, Mar 30, 2026 at 03:09:45PM +0200, Gerd Bayer wrote:
> When inspecting the config space of a Connect-X physical function in an
> s390 system after it was initialized by the mlx5_core device driver, we
> found the function to be enabled to request AtomicOps despite the
> system's root-complex lacking support for completing them:
>
> 1ed0:00:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
> Subsystem: Mellanox Technologies Device 0002
> [...]
> DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
> AtomicOpsCtl: ReqEn+
> IDOReq- IDOCompl- LTR- EmergencyPowerReductionReq-
> 10BitTagReq- OBFF Disabled, EETLPPrefixBlk-
>
> Turns out the device driver calls pci_enable_atomic_ops_to_root() which
> defaulted to enable AtomicOps requests even if it had no information
> about the root port that the PCIe device is attached to.
>
> Change the logic of pci_enable_atomic_ops_to_root() to fully traverse the
> PCIe tree upwards, check that the bridge devices support delivering
> AtomicOps transactions, and finally check that there is a root port at
> the end that does support completing AtomicOps.
>
> Reported-by: Alexander Schmidt <alexs@linux.ibm.com>
> Cc: stable@vger.kernel.org
> Fixes: 430a23689dea ("PCI: Add pci_enable_atomic_ops_to_root()")
> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
OK, I think this is set to go. It sounds like there are no RCiEPs
that we need to worry about.
I think pci_enable_atomic_ops_to_root() will end up more readable if
we check for the Root Port first and explicitly as in the modified
version. I *think* it's equivalent but can't easily test it. What do
you think?
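For illustration only: the explicit Root Port test proposed here boils down to a subset check on the DEVCAP2 completer bits. The following is a minimal user-space sketch, not kernel code. The bit values are assumed to match include/uapi/linux/pci_regs.h, and root_supports_sizes() and its self-check are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* DEVCAP2 AtomicOp completer bits (assumed to match pci_regs.h) */
#define PCI_EXP_DEVCAP2_ATOMIC_COMP32	0x00000080
#define PCI_EXP_DEVCAP2_ATOMIC_COMP64	0x00000100
#define PCI_EXP_DEVCAP2_ATOMIC_COMP128	0x00000200

/* Hypothetical helper: true only if every requested size is supported. */
static bool root_supports_sizes(uint32_t cap, uint32_t cap_mask)
{
	return (cap & cap_mask) == cap_mask;
}

/* Small self-check showing the intended semantics. */
static void root_supports_sizes_selfcheck(void)
{
	uint32_t cap = PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
		       PCI_EXP_DEVCAP2_ATOMIC_COMP64;

	/* A subset of the supported sizes passes. */
	assert(root_supports_sizes(cap, PCI_EXP_DEVCAP2_ATOMIC_COMP32));
	/* Asking for a size the port lacks fails, even if others match. */
	assert(!root_supports_sizes(cap, PCI_EXP_DEVCAP2_ATOMIC_COMP64 |
					 PCI_EXP_DEVCAP2_ATOMIC_COMP128));
}
```

The point of the "== cap_mask" form is that a partial match is a failure: a caller asking for 64-bit and 128-bit completion must not be enabled when only 64-bit is available.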
commit 2f3f32f2c180 ("PCI: Enable AtomicOps only if Root Port supports them")
Author: Gerd Bayer <gbayer@linux.ibm.com>
Date: Mon Mar 30 15:09:45 2026 +0200
PCI: Enable AtomicOps only if Root Port supports them
When inspecting the config space of a Connect-X physical function in an
s390 system after it was initialized by the mlx5_core device driver, we
found the function to be enabled to request AtomicOps despite the Root Port
lacking support for completing them:
00:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
Subsystem: Mellanox Technologies Device 0002
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
AtomicOpsCtl: ReqEn+
On s390 and many virtualized guests, the Endpoint is visible but the Root
Port is not. In this case, pci_enable_atomic_ops_to_root() previously
enabled AtomicOps in the Endpoint even though it couldn't tell whether
the Root Port supports them as a completer.
Change pci_enable_atomic_ops_to_root() to fail if there's no Root Port or
the Root Port doesn't support AtomicOps.
Fixes: 430a23689dea ("PCI: Add pci_enable_atomic_ops_to_root()")
Reported-by: Alexander Schmidt <alexs@linux.ibm.com>
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 135e5b591df4..515f565a4a70 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3675,8 +3675,7 @@ void pci_acs_init(struct pci_dev *dev)
*/
int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
{
- struct pci_bus *bus = dev->bus;
- struct pci_dev *bridge;
+ struct pci_dev *root, *bridge;
u32 cap, ctl2;
/*
@@ -3705,35 +3704,35 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
return -EINVAL;
}
- while (bus->parent) {
- bridge = bus->self;
+ root = pcie_find_root_port(dev);
+ if (!root)
+ return -EINVAL;
- pcie_capability_read_dword(bridge, PCI_EXP_DEVCAP2, &cap);
+ pcie_capability_read_dword(bridge, PCI_EXP_DEVCAP2, &cap);
+ if ((cap & cap_mask) != cap_mask)
+ return -EINVAL;
+ bridge = pci_upstream_bridge(dev);
+ while (bridge != root) {
switch (pci_pcie_type(bridge)) {
- /* Ensure switch ports support AtomicOp routing */
case PCI_EXP_TYPE_UPSTREAM:
- case PCI_EXP_TYPE_DOWNSTREAM:
- if (!(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
- return -EINVAL;
- break;
-
- /* Ensure root port supports all the sizes we care about */
- case PCI_EXP_TYPE_ROOT_PORT:
- if ((cap & cap_mask) != cap_mask)
- return -EINVAL;
- break;
- }
-
- /* Ensure upstream ports don't block AtomicOps on egress */
- if (pci_pcie_type(bridge) == PCI_EXP_TYPE_UPSTREAM) {
+ /* Upstream ports must not block AtomicOps on egress */
pcie_capability_read_dword(bridge, PCI_EXP_DEVCTL2,
&ctl2);
if (ctl2 & PCI_EXP_DEVCTL2_ATOMIC_EGRESS_BLOCK)
return -EINVAL;
+ fallthrough;
+
+ /* All switch ports need to route AtomicOps */
+ case PCI_EXP_TYPE_DOWNSTREAM:
+ pcie_capability_read_dword(bridge, PCI_EXP_DEVCAP2,
+ &cap);
+ if (!(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
+ return -EINVAL;
+ break;
}
- bus = bus->parent;
+ bridge = pci_upstream_bridge(bridge);
}
pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v7 2/3] PCI: AtomicOps: Do not enable without support in root port
2026-04-01 17:27 ` Bjorn Helgaas
@ 2026-04-02 14:44 ` Gerd Bayer
0 siblings, 0 replies; 14+ messages in thread
From: Gerd Bayer @ 2026-04-02 14:44 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Bjorn Helgaas, Jay Cornwall, Felix Kuehling, Ilpo Järvinen,
Christian Borntraeger, Niklas Schnelle, Gerald Schaefer,
Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Sven Schnelle,
Leon Romanovsky, Alexander Schmidt, linux-s390, linux-pci,
linux-kernel, netdev, linux-rdma, stable, Gerd Bayer
On Wed, 2026-04-01 at 12:27 -0500, Bjorn Helgaas wrote:
> On Mon, Mar 30, 2026 at 03:09:45PM +0200, Gerd Bayer wrote:
> > When inspecting the config space of a Connect-X physical function in an
> > s390 system after it was initialized by the mlx5_core device driver, we
> > found the function to be enabled to request AtomicOps despite the
> > system's root-complex lacking support for completing them:
> >
> > 1ed0:00:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
> > Subsystem: Mellanox Technologies Device 0002
> > [...]
> > DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
> > AtomicOpsCtl: ReqEn+
> > IDOReq- IDOCompl- LTR- EmergencyPowerReductionReq-
> > 10BitTagReq- OBFF Disabled, EETLPPrefixBlk-
> >
> > Turns out the device driver calls pci_enable_atomic_ops_to_root() which
> > defaulted to enable AtomicOps requests even if it had no information
> > about the root port that the PCIe device is attached to.
> >
> > Change the logic of pci_enable_atomic_ops_to_root() to fully traverse the
> > PCIe tree upwards, check that the bridge devices support delivering
> > AtomicOps transactions, and finally check that there is a root port at
> > the end that does support completing AtomicOps.
> >
> > Reported-by: Alexander Schmidt <alexs@linux.ibm.com>
> > Cc: stable@vger.kernel.org
> > Fixes: 430a23689dea ("PCI: Add pci_enable_atomic_ops_to_root()")
> > Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
>
> OK, I think this is set to go. It sounds like there are no RCiEPs
> that we need to worry about.
>
> I think pci_enable_atomic_ops_to_root() will end up more readable if
> we check for the Root Port first and explicitly as in the modified
> version. I *think* it's equivalent but can't easily test it. What do
> you think?
At first sight it appears counter-intuitive to test the root port's
capabilities before traversing the hierarchy, but with the explicit
read of the root port's DEVCAP2 we no longer depend on the cap value
that was read inside the while loop.
My testing is somewhat limited, too - but I've verified that the
results with your patch (+ a small nit - see below) are the same as
with my version:
- ConnectX-5 Ex on s390: AtomicOpsCtl: ReqEn-
- ConnectX-6 Dc on x86_64: AtomicOpsCtl: ReqEn+
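To make the root-first ordering easier to reason about without hardware, here is a rough user-space model of the check sequence. It is only a sketch under stated assumptions: struct model_dev, enum model_type, model_find_root_port() and model_atomic_ops_ok() are hypothetical stand-ins for struct pci_dev, pci_pcie_type(), pcie_find_root_port() and the body of pci_enable_atomic_ops_to_root(). The bit values are assumed to match include/uapi/linux/pci_regs.h, and the lookup simply returns the first Root Port found upstream.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values assumed to match include/uapi/linux/pci_regs.h */
#define PCI_EXP_DEVCAP2_ATOMIC_ROUTE		0x00000040
#define PCI_EXP_DEVCAP2_ATOMIC_COMP32		0x00000080
#define PCI_EXP_DEVCAP2_ATOMIC_COMP64		0x00000100
#define PCI_EXP_DEVCTL2_ATOMIC_EGRESS_BLOCK	0x0080

/* Hypothetical stand-ins for the PCIe port types and struct pci_dev */
enum model_type {
	MODEL_ENDPOINT, MODEL_UPSTREAM, MODEL_DOWNSTREAM, MODEL_ROOT_PORT
};

struct model_dev {
	enum model_type type;
	uint32_t devcap2;
	uint16_t devctl2;
	struct model_dev *upstream;	/* NULL if no bridge is visible */
};

/* Simplified lookup: first Root Port above dev, or NULL if none is visible */
static struct model_dev *model_find_root_port(struct model_dev *dev)
{
	struct model_dev *p;

	for (p = dev->upstream; p; p = p->upstream)
		if (p->type == MODEL_ROOT_PORT)
			return p;
	return NULL;
}

/* Returns true if AtomicOp requests from dev could be enabled */
static bool model_atomic_ops_ok(struct model_dev *dev, uint32_t cap_mask)
{
	struct model_dev *root, *bridge;

	assert(dev != NULL);

	/* Step 1: a Root Port must exist and complete all requested sizes */
	root = model_find_root_port(dev);
	if (!root)
		return false;
	if ((root->devcap2 & cap_mask) != cap_mask)
		return false;

	/* Step 2: every switch port below the Root Port must route AtomicOps */
	for (bridge = dev->upstream; bridge != root; bridge = bridge->upstream) {
		switch (bridge->type) {
		case MODEL_UPSTREAM:
			/* Upstream ports must not block AtomicOps on egress */
			if (bridge->devctl2 & PCI_EXP_DEVCTL2_ATOMIC_EGRESS_BLOCK)
				return false;
			/* fall through */
		case MODEL_DOWNSTREAM:
			if (!(bridge->devcap2 & PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
				return false;
			break;
		default:
			break;
		}
	}
	return true;
}
```

In this model, an endpoint with no visible bridge (the s390 case) fails in step 1, while an endpoint directly below a capable Root Port passes without entering the loop, which matches the two results listed above.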
>
> commit 2f3f32f2c180 ("PCI: Enable AtomicOps only if Root Port supports them")
> Author: Gerd Bayer <gbayer@linux.ibm.com>
> Date: Mon Mar 30 15:09:45 2026 +0200
>
> PCI: Enable AtomicOps only if Root Port supports them
>
> When inspecting the config space of a Connect-X physical function in an
> s390 system after it was initialized by the mlx5_core device driver, we
> found the function to be enabled to request AtomicOps despite the Root Port
> lacking support for completing them:
>
> 00:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
> Subsystem: Mellanox Technologies Device 0002
> DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
> AtomicOpsCtl: ReqEn+
>
> On s390 and many virtualized guests, the Endpoint is visible but the Root
> Port is not. In this case, pci_enable_atomic_ops_to_root() previously
> enabled AtomicOps in the Endpoint even though it couldn't tell whether
> the Root Port supports them as a completer.
>
> Change pci_enable_atomic_ops_to_root() to fail if there's no Root Port or
> the Root Port doesn't support AtomicOps.
>
> Fixes: 430a23689dea ("PCI: Add pci_enable_atomic_ops_to_root()")
> Reported-by: Alexander Schmidt <alexs@linux.ibm.com>
> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> Cc: stable@vger.kernel.org
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 135e5b591df4..515f565a4a70 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3675,8 +3675,7 @@ void pci_acs_init(struct pci_dev *dev)
> */
> int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
> {
> - struct pci_bus *bus = dev->bus;
> - struct pci_dev *bridge;
> + struct pci_dev *root, *bridge;
> u32 cap, ctl2;
>
> /*
> @@ -3705,35 +3704,35 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
> return -EINVAL;
> }
>
> - while (bus->parent) {
> - bridge = bus->self;
> + root = pcie_find_root_port(dev);
> + if (!root)
> + return -EINVAL;
>
> - pcie_capability_read_dword(bridge, PCI_EXP_DEVCAP2, &cap);
> + pcie_capability_read_dword(bridge, PCI_EXP_DEVCAP2, &cap);
You want to read DEVCAP2 on root here; bridge is still uninitialized at
this point.
> + if ((cap & cap_mask) != cap_mask)
> + return -EINVAL;
>
> + bridge = pci_upstream_bridge(dev);
> + while (bridge != root) {
> switch (pci_pcie_type(bridge)) {
> - /* Ensure switch ports support AtomicOp routing */
> case PCI_EXP_TYPE_UPSTREAM:
> - case PCI_EXP_TYPE_DOWNSTREAM:
> - if (!(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
> - return -EINVAL;
> - break;
> -
> - /* Ensure root port supports all the sizes we care about */
> - case PCI_EXP_TYPE_ROOT_PORT:
> - if ((cap & cap_mask) != cap_mask)
> - return -EINVAL;
> - break;
> - }
> -
> - /* Ensure upstream ports don't block AtomicOps on egress */
> - if (pci_pcie_type(bridge) == PCI_EXP_TYPE_UPSTREAM) {
> + /* Upstream ports must not block AtomicOps on egress */
> pcie_capability_read_dword(bridge, PCI_EXP_DEVCTL2,
> &ctl2);
> if (ctl2 & PCI_EXP_DEVCTL2_ATOMIC_EGRESS_BLOCK)
> return -EINVAL;
> + fallthrough;
> +
> + /* All switch ports need to route AtomicOps */
> + case PCI_EXP_TYPE_DOWNSTREAM:
> + pcie_capability_read_dword(bridge, PCI_EXP_DEVCAP2,
> + &cap);
> + if (!(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
> + return -EINVAL;
> + break;
> }
>
> - bus = bus->parent;
> + bridge = pci_upstream_bridge(bridge);
> }
>
> pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
Thanks,
Gerd
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v7 3/3] PCI: AtomicOps: Update references to PCIe spec
2026-03-30 13:09 [PATCH v7 0/3] PCI: AtomicOps: Fix pci_enable_atomic_ops_to_root() Gerd Bayer
2026-03-30 13:09 ` [PATCH v7 1/3] PCI: AtomicOps: Do not enable requests by RCiEPs Gerd Bayer
2026-03-30 13:09 ` [PATCH v7 2/3] PCI: AtomicOps: Do not enable without support in root port Gerd Bayer
@ 2026-03-30 13:09 ` Gerd Bayer
2026-04-02 16:38 ` [PATCH v7 0/3] PCI: AtomicOps: Fix pci_enable_atomic_ops_to_root() Bjorn Helgaas
3 siblings, 0 replies; 14+ messages in thread
From: Gerd Bayer @ 2026-03-30 13:09 UTC (permalink / raw)
To: Bjorn Helgaas, Jay Cornwall, Felix Kuehling, Ilpo Järvinen,
Christian Borntraeger, Niklas Schnelle
Cc: Gerald Schaefer, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
Sven Schnelle, Leon Romanovsky, Alexander Schmidt, linux-s390,
linux-pci, linux-kernel, netdev, linux-rdma, Gerd Bayer
Point to the relevant sections in the most recent release 7.0 of the
PCIe spec. Text has mostly just moved around without any semantic
change.
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
---
drivers/pci/pci.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 57af00ecdc97086a32c063ff86f8a39087ad1f5e..b99ab47678b006004af6cdb9b0e9f9ca4a28b6e1 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3689,7 +3689,7 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
u32 ctl2;
/*
- * Per PCIe r5.0, sec 9.3.5.10, the AtomicOp Requester Enable bit
+ * Per PCIe r7.0, sec 7.5.3.16, the AtomicOp Requester Enable bit
* in Device Control 2 is reserved in VFs and the PF value applies
* to all associated VFs.
*/
@@ -3700,7 +3700,7 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
return -EINVAL;
/*
- * Per PCIe r4.0, sec 6.15, endpoints and root ports may be
+ * Per PCIe r7.0, sec 6.15, endpoints and root ports may be
* AtomicOp requesters. For now, we only support (legacy) endpoints
* as requesters and root ports as completers. No endpoints as
* completers, and no peer-to-peer.
--
2.51.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v7 0/3] PCI: AtomicOps: Fix pci_enable_atomic_ops_to_root()
2026-03-30 13:09 [PATCH v7 0/3] PCI: AtomicOps: Fix pci_enable_atomic_ops_to_root() Gerd Bayer
` (2 preceding siblings ...)
2026-03-30 13:09 ` [PATCH v7 3/3] PCI: AtomicOps: Update references to PCIe spec Gerd Bayer
@ 2026-04-02 16:38 ` Bjorn Helgaas
3 siblings, 0 replies; 14+ messages in thread
From: Bjorn Helgaas @ 2026-04-02 16:38 UTC (permalink / raw)
To: Gerd Bayer
Cc: Bjorn Helgaas, Jay Cornwall, Felix Kuehling, Ilpo Järvinen,
Christian Borntraeger, Niklas Schnelle, Gerald Schaefer,
Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Sven Schnelle,
Leon Romanovsky, Alexander Schmidt, linux-s390, linux-pci,
linux-kernel, netdev, linux-rdma, stable
On Mon, Mar 30, 2026 at 03:09:43PM +0200, Gerd Bayer wrote:
> Hi Bjorn et al.
>
> On s390, AtomicOp Requests are enabled on a PCI function that supports
> them, despite the helper being ignorant about the root port's capability
> to support their completion.
>
> Patch 1: Do not enable AtomicOps Requests on RCiEPs
> Patch 2: Fix the logic in pci_enable_atomic_ops_to_root()
> Patch 3: Update references to PCIe spec in that function.
>
> I did test that the issue is fixed with these patches. Also, I verified
> that a Mellanox/Nvidia ConnectX-6 adapter plugged straight into the
> root port of an x86 system still gets AtomicOp Requests enabled.
>
> Due to a lack of the required hardware, I did not test this with any PCIe
> switches between root port and endpoint. So test exposure in other
> environments is highly appreciated.
>
> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
Thanks, applied to pci/atomics for v7.1 with minor rework of 2/3.
> ---
> Changes in v7:
> - Prepend series with a patch to explicitly exclude RCiEPs from
> enablement of AtomicOps Requests
> - Limit the core patch 2 to enforce a full check of the entire
> PCIe hierarchy for support of AtomicOps capabilities.
> - Rebase to v7.0-rc6
> - Link to v6: https://lore.kernel.org/r/20260325-fix_pciatops-v6-0-10bf19d76dd1@linux.ibm.com
>
> Changes in v6:
> - Incorporate Ilpo's editorial comments.
> - Correct logic in pci_is_atomicops_capable_rp() (annotated by Sashiko)
> - Link to v5: https://lore.kernel.org/r/20260323-fix_pciatops-v5-0-fada7233aea8@linux.ibm.com
>
> Changes in v5:
> - Introduce new pcibios_connects_to_atomicops_capable_rc() so arch's can
> declare AtomicOps support outside of PCIe config space. Defaults to
> "true" - except s390.
> - rebase to 7.0-rc5
> - Link to v4: https://lore.kernel.org/r/20260313-fix_pciatops-v4-0-93bc70a63935@linux.ibm.com
>
> Changes in v4:
> - drop patch 1 - it will become the base of a new series
> - previous patch 2, now 1: reword commit message
> - add a new patch to update references to PCI spec within
> pci_enable_atomic_ops_to_root()
> - rebase to latest master
> - Link to v3: https://lore.kernel.org/r/20260306-fix_pciatops-v3-0-99d12bcafb19@linux.ibm.com
>
> Changes in v3:
> - rebase to 7.0-rc2
> - gentle ping
> - add netdev and rdma lists for awareness
> - Link to v2: https://lore.kernel.org/r/20251216-fix_pciatops-v2-0-d013e9b7e2ee@linux.ibm.com
>
> Changes in v2:
> - rebase to 6.19-rc1
> - otherwise unchanged from v1
> - Link to v1: https://lore.kernel.org/r/20251110-fix_pciatops-v1-0-edc58a57b62e@linux.ibm.com
>
> ---
> Gerd Bayer (3):
> PCI: AtomicOps: Do not enable requests by RCiEPs
> PCI: AtomicOps: Do not enable without support in root port
> PCI: AtomicOps: Update references to PCIe spec
>
> drivers/pci/pci.c | 48 ++++++++++++++++++++++++++----------------------
> 1 file changed, 26 insertions(+), 22 deletions(-)
> ---
> base-commit: 7aaa8047eafd0bd628065b15757d9b48c5f9c07d
> change-id: 20251106-fix_pciatops-7e8608eccb03
>
> Best regards,
> --
> Gerd Bayer <gbayer@linux.ibm.com>
>
^ permalink raw reply [flat|nested] 14+ messages in thread