* [PATCH v3 0/2] PCI: AtomicOps: Fix pci_enable_atomic_ops_to_root()
@ 2026-03-06 17:13 Gerd Bayer
2026-03-06 17:13 ` [PATCH v3 1/2] PCI: AtomicOps: Define valid root port capabilities Gerd Bayer
2026-03-06 17:13 ` [PATCH v3 2/2] PCI: AtomicOps: Fix logic in enable function Gerd Bayer
0 siblings, 2 replies; 7+ messages in thread
From: Gerd Bayer @ 2026-03-06 17:13 UTC (permalink / raw)
To: Bjorn Helgaas, Jay Cornwall, Felix Kuehling
Cc: Leon Romanovsky, Niklas Schnelle, Alexander Schmidt, linux-s390,
linux-pci, linux-kernel, netdev, linux-rdma, Gerd Bayer, stable
Hi Bjorn et al.
this series addresses a few issues that have come up with the helper
function that enables Atomic Op Requests to be initiated by PCI
enpoints:
A. Most in-tree users of this helper use it incorrectly [0].
B. On s390, Atomic Op Requests are enabled, although the helper
cannot know whether the root port is really supporting them.
C. Loop control in the helper function does not guarantee that a root
port's capabilities are ever checked against those requested by the
caller.
Address these issue with the following patches:
Patch 1: Make it harder to mis-use the enablement function,
Patch 2: Addresses issues B. and C.
I did test that issue B is fixed with these patches. Also, I verified
that Atomic Ops enablement on a Mellanox/Nvidia ConnectX-6 adapter
plugged straight into the root port of a x86 system still gets AtomicOp
Requests enabled. However, I did not test this with any PCIe switches
between root port and endpoint.
Ideally, both patches would be incorporated immediately, so we could
start correcting the mis-uses in the device drivers. I don't know of any
complaints when using Atomic Ops on devices where the driver is
mis-using the helper. Patch 2 however, is fixing an obseved issue.
[0]: https://lore.kernel.org/all/fbe34de16f5c0bf25a16f9819a57fdd81e5bb08c.camel@linux.ibm.com/
[1]: https://lore.kernel.org/all/20251105-mlxatomics-v1-0-10c71649e08d@linux.ibm.com/
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
---
Changes in v3:
- rebase to 7.0-rc2
- gentle ping
- add netdev and rdma lists for awareness
- Link to v2: https://lore.kernel.org/r/20251216-fix_pciatops-v2-0-d013e9b7e2ee@linux.ibm.com
Changes in v2:
- rebase to 6.19-rc1
- otherwise unchanged to v1
- Link to v1: https://lore.kernel.org/r/20251110-fix_pciatops-v1-0-edc58a57b62e@linux.ibm.com
---
Gerd Bayer (2):
PCI: AtomicOps: Define valid root port capabilities
PCI: AtomicOps: Fix logic in enable function
drivers/pci/pci.c | 43 +++++++++++++++++++++----------------------
include/uapi/linux/pci_regs.h | 8 ++++++++
2 files changed, 29 insertions(+), 22 deletions(-)
---
base-commit: 5ee8dbf54602dc340d6235b1d6aa17c0f283f48c
change-id: 20251106-fix_pciatops-7e8608eccb03
Best regards,
--
Gerd Bayer <gbayer@linux.ibm.com>
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v3 1/2] PCI: AtomicOps: Define valid root port capabilities
2026-03-06 17:13 [PATCH v3 0/2] PCI: AtomicOps: Fix pci_enable_atomic_ops_to_root() Gerd Bayer
@ 2026-03-06 17:13 ` Gerd Bayer
2026-03-10 21:49 ` Bjorn Helgaas
2026-03-06 17:13 ` [PATCH v3 2/2] PCI: AtomicOps: Fix logic in enable function Gerd Bayer
1 sibling, 1 reply; 7+ messages in thread
From: Gerd Bayer @ 2026-03-06 17:13 UTC (permalink / raw)
To: Bjorn Helgaas, Jay Cornwall, Felix Kuehling
Cc: Leon Romanovsky, Niklas Schnelle, Alexander Schmidt, linux-s390,
linux-pci, linux-kernel, netdev, linux-rdma, Gerd Bayer
Provide the two combinations of Atomic Op Completion size attributes
that a root port may support per PCIe Spec 7.0 section 6.15.3.1. -
besides the trivial "No support" - as two new defines.
Change documentation of pci_enable_atomic_ops_to_root() that these are
the only ones that should be used. Also, spell out that all requested
capabilities need to be supported at the root port for enable to
succeed. Also emphasize that on success, this sets AtomicOpsCtl:ReqEn to
1, and leaves it untouched in case of failure.
Suggested-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
---
drivers/pci/pci.c | 13 +++++++------
include/uapi/linux/pci_regs.h | 8 ++++++++
2 files changed, 15 insertions(+), 6 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 8479c2e1f74f1044416281aba11bf071ea89488a..cc8abe6b1d07661488895876dbbcf8aaeadf4a17 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3663,15 +3663,16 @@ void pci_acs_init(struct pci_dev *dev)
/**
* pci_enable_atomic_ops_to_root - enable AtomicOp requests to root port
* @dev: the PCI device
- * @cap_mask: mask of desired AtomicOp sizes, including one or more of:
- * PCI_EXP_DEVCAP2_ATOMIC_COMP32
- * PCI_EXP_DEVCAP2_ATOMIC_COMP64
- * PCI_EXP_DEVCAP2_ATOMIC_COMP128
+ * @cap_mask: root port must support combinations of AtomicOp sizes
+ * PCI_EXP_ROOT_PORT_ATOMIC_BASE
+ * PCI_EXP_ROOT_PORT_ATOMIC_FULL
*
* Return 0 if all upstream bridges support AtomicOp routing, egress
* blocking is disabled on all upstream ports, and the root port supports
- * the requested completion capabilities (32-bit, 64-bit and/or 128-bit
- * AtomicOp completion), or negative otherwise.
+ * all the requested completion capabilities (BASE: 32-bit, 64-bit or
+ * FULL: 32/64- and 128-bit AtomicOp completion). In that case enable the
+ * device to send AtomicOp requests. Otherwise, return negative and leave
+ * the enablement in the PCI config space untouched.
*/
int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
{
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index 14f634ab9350d5442192162225b5e5202dbe2308..63ac62b882a94c6873a0db433ba808332ddbea04 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -669,6 +669,14 @@
#define PCI_EXP_DEVCAP2_ATOMIC_COMP32 0x00000080 /* 32b AtomicOp completion */
#define PCI_EXP_DEVCAP2_ATOMIC_COMP64 0x00000100 /* 64b AtomicOp completion */
#define PCI_EXP_DEVCAP2_ATOMIC_COMP128 0x00000200 /* 128b AtomicOp completion */
+/* PCIe spec 7.0 6.15.3.1: Root ports may support one of 2 sets of Atomic Ops */
+#define PCI_EXP_ROOT_PORT_ATOMIC_BASE \
+ (PCI_EXP_DEVCAP2_ATOMIC_COMP32 | \
+ PCI_EXP_DEVCAP2_ATOMIC_COMP64)
+#define PCI_EXP_ROOT_PORT_ATOMIC_FULL \
+ (PCI_EXP_DEVCAP2_ATOMIC_COMP32 | \
+ PCI_EXP_DEVCAP2_ATOMIC_COMP64 | \
+ PCI_EXP_DEVCAP2_ATOMIC_COMP128)
#define PCI_EXP_DEVCAP2_LTR 0x00000800 /* Latency tolerance reporting */
#define PCI_EXP_DEVCAP2_TPH_COMP_MASK 0x00003000 /* TPH completer support */
#define PCI_EXP_DEVCAP2_OBFF_MASK 0x000c0000 /* OBFF support mechanism */
--
2.51.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v3 2/2] PCI: AtomicOps: Fix logic in enable function
2026-03-06 17:13 [PATCH v3 0/2] PCI: AtomicOps: Fix pci_enable_atomic_ops_to_root() Gerd Bayer
2026-03-06 17:13 ` [PATCH v3 1/2] PCI: AtomicOps: Define valid root port capabilities Gerd Bayer
@ 2026-03-06 17:13 ` Gerd Bayer
2026-03-10 21:52 ` Bjorn Helgaas
1 sibling, 1 reply; 7+ messages in thread
From: Gerd Bayer @ 2026-03-06 17:13 UTC (permalink / raw)
To: Bjorn Helgaas, Jay Cornwall, Felix Kuehling
Cc: Leon Romanovsky, Niklas Schnelle, Alexander Schmidt, linux-s390,
linux-pci, linux-kernel, netdev, linux-rdma, Gerd Bayer, stable
Move the check for root port requirements past the loop within
pci_enable_atomic_ops_to_root() that checks on potential switch
(up- and downstream) ports.
Inside the loop traversing the PCI tree upwards, prepend the switch case
to validate the routing capability on any port with a fallthrough-case
that does the additional check for Atomic Ops not being blocked on
upstream ports.
Do not enable Atomic Op Requests if nothing can be learned about how the
device is attached - e.g. if it is on an "isolated" bus, as in s390.
Reported-by: Alexander Schmidt <alexs@linux.ibm.com>
Cc: stable@vger.kernel.org
Fixes: 430a23689dea ("PCI: Add pci_enable_atomic_ops_to_root()")
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
---
drivers/pci/pci.c | 30 ++++++++++++++----------------
1 file changed, 14 insertions(+), 16 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index cc8abe6b1d07661488895876dbbcf8aaeadf4a17..23db6ad5f310ed009a9b2ca4933c7498e0d22b85 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3677,7 +3677,7 @@ void pci_acs_init(struct pci_dev *dev)
int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
{
struct pci_bus *bus = dev->bus;
- struct pci_dev *bridge;
+ struct pci_dev *bridge = NULL;
u32 cap, ctl2;
/*
@@ -3715,29 +3715,27 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
switch (pci_pcie_type(bridge)) {
/* Ensure switch ports support AtomicOp routing */
case PCI_EXP_TYPE_UPSTREAM:
- case PCI_EXP_TYPE_DOWNSTREAM:
- if (!(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
- return -EINVAL;
- break;
-
- /* Ensure root port supports all the sizes we care about */
- case PCI_EXP_TYPE_ROOT_PORT:
- if ((cap & cap_mask) != cap_mask)
- return -EINVAL;
- break;
- }
-
- /* Ensure upstream ports don't block AtomicOps on egress */
- if (pci_pcie_type(bridge) == PCI_EXP_TYPE_UPSTREAM) {
+ /* Upstream ports must not block AtomicOps on egress */
pcie_capability_read_dword(bridge, PCI_EXP_DEVCTL2,
&ctl2);
if (ctl2 & PCI_EXP_DEVCTL2_ATOMIC_EGRESS_BLOCK)
return -EINVAL;
+ fallthrough;
+ /* All switch ports need to route AtomicOps */
+ case PCI_EXP_TYPE_DOWNSTREAM:
+ if (!(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
+ return -EINVAL;
+ break;
}
-
bus = bus->parent;
}
+ /* Finally, last bridge must be root port and support requested sizes */
+ if ((!bridge) ||
+ (pci_pcie_type(bridge) != PCI_EXP_TYPE_ROOT_PORT) ||
+ ((cap & cap_mask) != cap_mask))
+ return -EINVAL;
+
pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
PCI_EXP_DEVCTL2_ATOMIC_REQ);
return 0;
--
2.51.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v3 1/2] PCI: AtomicOps: Define valid root port capabilities
2026-03-06 17:13 ` [PATCH v3 1/2] PCI: AtomicOps: Define valid root port capabilities Gerd Bayer
@ 2026-03-10 21:49 ` Bjorn Helgaas
2026-03-11 10:43 ` Gerd Bayer
0 siblings, 1 reply; 7+ messages in thread
From: Bjorn Helgaas @ 2026-03-10 21:49 UTC (permalink / raw)
To: Gerd Bayer
Cc: Bjorn Helgaas, Jay Cornwall, Felix Kuehling, Leon Romanovsky,
Niklas Schnelle, Alexander Schmidt, linux-s390, linux-pci,
linux-kernel, netdev, linux-rdma
On Fri, Mar 06, 2026 at 06:13:58PM +0100, Gerd Bayer wrote:
> Provide the two combinations of Atomic Op Completion size attributes
> that a root port may support per PCIe Spec 7.0 section 6.15.3.1. -
> besides the trivial "No support" - as two new defines.
>
> Change documentation of pci_enable_atomic_ops_to_root() that these are
> the only ones that should be used. Also, spell out that all requested
> capabilities need to be supported at the root port for enable to
> succeed. Also emphasize that on success, this sets AtomicOpsCtl:ReqEn to
> 1, and leaves it untouched in case of failure.
>
> Suggested-by: Leon Romanovsky <leon@kernel.org>
> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> ---
> drivers/pci/pci.c | 13 +++++++------
> include/uapi/linux/pci_regs.h | 8 ++++++++
> 2 files changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 8479c2e1f74f1044416281aba11bf071ea89488a..cc8abe6b1d07661488895876dbbcf8aaeadf4a17 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3663,15 +3663,16 @@ void pci_acs_init(struct pci_dev *dev)
> /**
> * pci_enable_atomic_ops_to_root - enable AtomicOp requests to root port
> * @dev: the PCI device
> - * @cap_mask: mask of desired AtomicOp sizes, including one or more of:
> - * PCI_EXP_DEVCAP2_ATOMIC_COMP32
> - * PCI_EXP_DEVCAP2_ATOMIC_COMP64
> - * PCI_EXP_DEVCAP2_ATOMIC_COMP128
> + * @cap_mask: root port must support combinations of AtomicOp sizes
> + * PCI_EXP_ROOT_PORT_ATOMIC_BASE
> + * PCI_EXP_ROOT_PORT_ATOMIC_FULL
> *
> * Return 0 if all upstream bridges support AtomicOp routing, egress
> * blocking is disabled on all upstream ports, and the root port supports
> - * the requested completion capabilities (32-bit, 64-bit and/or 128-bit
> - * AtomicOp completion), or negative otherwise.
> + * all the requested completion capabilities (BASE: 32-bit, 64-bit or
> + * FULL: 32/64- and 128-bit AtomicOp completion). In that case enable the
> + * device to send AtomicOp requests. Otherwise, return negative and leave
> + * the enablement in the PCI config space untouched.
> */
> int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
> {
> diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
> index 14f634ab9350d5442192162225b5e5202dbe2308..63ac62b882a94c6873a0db433ba808332ddbea04 100644
> --- a/include/uapi/linux/pci_regs.h
> +++ b/include/uapi/linux/pci_regs.h
> @@ -669,6 +669,14 @@
> #define PCI_EXP_DEVCAP2_ATOMIC_COMP32 0x00000080 /* 32b AtomicOp completion */
> #define PCI_EXP_DEVCAP2_ATOMIC_COMP64 0x00000100 /* 64b AtomicOp completion */
> #define PCI_EXP_DEVCAP2_ATOMIC_COMP128 0x00000200 /* 128b AtomicOp completion */
> +/* PCIe spec 7.0 6.15.3.1: Root ports may support one of 2 sets of Atomic Ops */
> +#define PCI_EXP_ROOT_PORT_ATOMIC_BASE \
> + (PCI_EXP_DEVCAP2_ATOMIC_COMP32 | \
> + PCI_EXP_DEVCAP2_ATOMIC_COMP64)
> +#define PCI_EXP_ROOT_PORT_ATOMIC_FULL \
> + (PCI_EXP_DEVCAP2_ATOMIC_COMP32 | \
> + PCI_EXP_DEVCAP2_ATOMIC_COMP64 | \
> + PCI_EXP_DEVCAP2_ATOMIC_COMP128)
I'm sort of ambivalent about this patch, partly because it adds
these #defines that aren't used anywhere. Also, the "BASE" and "FULL"
names don't contain as much information as mentioning COMP32, COMP64,
and COMP128 does.
If we *do* want this, I think these combo definitions are beyond the
scope of uapi/linux/pci_regs.h, which generally is just
transliteration of register bits from the spec. They could possibly
go in linux/pci.h where pci_enable_atomic_ops_to_root() is declared.
> #define PCI_EXP_DEVCAP2_LTR 0x00000800 /* Latency tolerance reporting */
> #define PCI_EXP_DEVCAP2_TPH_COMP_MASK 0x00003000 /* TPH completer support */
> #define PCI_EXP_DEVCAP2_OBFF_MASK 0x000c0000 /* OBFF support mechanism */
>
> --
> 2.51.0
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v3 2/2] PCI: AtomicOps: Fix logic in enable function
2026-03-06 17:13 ` [PATCH v3 2/2] PCI: AtomicOps: Fix logic in enable function Gerd Bayer
@ 2026-03-10 21:52 ` Bjorn Helgaas
2026-03-11 12:19 ` Gerd Bayer
0 siblings, 1 reply; 7+ messages in thread
From: Bjorn Helgaas @ 2026-03-10 21:52 UTC (permalink / raw)
To: Gerd Bayer
Cc: Bjorn Helgaas, Jay Cornwall, Felix Kuehling, Leon Romanovsky,
Niklas Schnelle, Alexander Schmidt, linux-s390, linux-pci,
linux-kernel, netdev, linux-rdma, stable
On Fri, Mar 06, 2026 at 06:13:59PM +0100, Gerd Bayer wrote:
> Move the check for root port requirements past the loop within
> pci_enable_atomic_ops_to_root() that checks on potential switch
> (up- and downstream) ports.
>
> Inside the loop traversing the PCI tree upwards, prepend the switch case
> to validate the routing capability on any port with a fallthrough-case
> that does the additional check for Atomic Ops not being blocked on
> upstream ports.
Thanks for looking at this. I think this makes good sense, and I'd
like to:
- Hoist the problem description up here. IIUC we enable AtomicOps on
s390 when we shouldn't, which presumably leads to some problem. I
think the same could happen anywhere we don't have a Root Port,
e.g., jailhouse, loongarch, maybe some VMM guests?
- Reduce or remove the text above, which is basically C code
translated to English, and move it down after the problem
description, so we can state the problem and symptom, followed by
the solution.
I think the core is (as you say below) that if there's no Root Port,
we previously allowed endpoints to use AtomicOps even in cases where
we don't know if the recipient supports them.
That *sounds* bad, and if you actually saw some kind of corruption as
a result, that would make this very compelling.
> Do not enable Atomic Op Requests if nothing can be learned about how the
> device is attached - e.g. if it is on an "isolated" bus, as in s390.
>
> Reported-by: Alexander Schmidt <alexs@linux.ibm.com>
If there's any public report of the problem, include the URL here.
> Cc: stable@vger.kernel.org
> Fixes: 430a23689dea ("PCI: Add pci_enable_atomic_ops_to_root()")
> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> ---
> drivers/pci/pci.c | 30 ++++++++++++++----------------
> 1 file changed, 14 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index cc8abe6b1d07661488895876dbbcf8aaeadf4a17..23db6ad5f310ed009a9b2ca4933c7498e0d22b85 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3677,7 +3677,7 @@ void pci_acs_init(struct pci_dev *dev)
> int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
> {
> struct pci_bus *bus = dev->bus;
> - struct pci_dev *bridge;
> + struct pci_dev *bridge = NULL;
> u32 cap, ctl2;
>
> /*
> @@ -3715,29 +3715,27 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
Since we're looking at this, I think we should update the spec
references in this function (in a separate patch).
* Per PCIe r5.0, sec 9.3.5.10, the AtomicOp Requester Enable bit
* in Device Control 2 is reserved in VFs and the PF value applies
* to all associated VFs.
It looks like the AtomicOp Requester Enable part of PCIe r5.0, sec
9.3.5.10, was incorporated into the Device Control 2 Register
description in PCIe r7.0, sec 7.5.3.16.
* Per PCIe r4.0, sec 6.15, endpoints and root ports may be
* AtomicOp requesters. For now, we only support endpoints as
* requesters and root ports as completers. No endpoints as
* completers, and no peer-to-peer.
This looks like PCIe r7.0, sec 6.15. Same section as r4.0, but we
should at least make both of these refer to the same spec revision.
> switch (pci_pcie_type(bridge)) {
> /* Ensure switch ports support AtomicOp routing */
> case PCI_EXP_TYPE_UPSTREAM:
> - case PCI_EXP_TYPE_DOWNSTREAM:
> - if (!(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
> - return -EINVAL;
> - break;
> -
> - /* Ensure root port supports all the sizes we care about */
> - case PCI_EXP_TYPE_ROOT_PORT:
> - if ((cap & cap_mask) != cap_mask)
> - return -EINVAL;
> - break;
> - }
> -
> - /* Ensure upstream ports don't block AtomicOps on egress */
> - if (pci_pcie_type(bridge) == PCI_EXP_TYPE_UPSTREAM) {
> + /* Upstream ports must not block AtomicOps on egress */
> pcie_capability_read_dword(bridge, PCI_EXP_DEVCTL2,
> &ctl2);
> if (ctl2 & PCI_EXP_DEVCTL2_ATOMIC_EGRESS_BLOCK)
> return -EINVAL;
> + fallthrough;
> + /* All switch ports need to route AtomicOps */
> + case PCI_EXP_TYPE_DOWNSTREAM:
> + if (!(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
> + return -EINVAL;
> + break;
> }
> -
> bus = bus->parent;
> }
>
> + /* Finally, last bridge must be root port and support requested sizes */
> + if ((!bridge) ||
> + (pci_pcie_type(bridge) != PCI_EXP_TYPE_ROOT_PORT) ||
> + ((cap & cap_mask) != cap_mask))
> + return -EINVAL;
> +
> pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
> PCI_EXP_DEVCTL2_ATOMIC_REQ);
> return 0;
>
> --
> 2.51.0
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v3 1/2] PCI: AtomicOps: Define valid root port capabilities
2026-03-10 21:49 ` Bjorn Helgaas
@ 2026-03-11 10:43 ` Gerd Bayer
0 siblings, 0 replies; 7+ messages in thread
From: Gerd Bayer @ 2026-03-11 10:43 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Bjorn Helgaas, Jay Cornwall, Felix Kuehling, Leon Romanovsky,
Niklas Schnelle, Alexander Schmidt, linux-s390, linux-pci,
linux-kernel, netdev, linux-rdma, Gerd Bayer
On Tue, 2026-03-10 at 16:49 -0500, Bjorn Helgaas wrote:
> On Fri, Mar 06, 2026 at 06:13:58PM +0100, Gerd Bayer wrote:
> > Provide the two combinations of Atomic Op Completion size attributes
> > that a root port may support per PCIe Spec 7.0 section 6.15.3.1. -
> > besides the trivial "No support" - as two new defines.
> >
> > Change documentation of pci_enable_atomic_ops_to_root() that these are
> > the only ones that should be used. Also, spell out that all requested
> > capabilities need to be supported at the root port for enable to
> > succeed. Also emphasize that on success, this sets AtomicOpsCtl:ReqEn to
> > 1, and leaves it untouched in case of failure.
> >
> > Suggested-by: Leon Romanovsky <leon@kernel.org>
> > Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> > ---
> > drivers/pci/pci.c | 13 +++++++------
> > include/uapi/linux/pci_regs.h | 8 ++++++++
> > 2 files changed, 15 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index 8479c2e1f74f1044416281aba11bf071ea89488a..cc8abe6b1d07661488895876dbbcf8aaeadf4a17 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -3663,15 +3663,16 @@ void pci_acs_init(struct pci_dev *dev)
> > /**
> > * pci_enable_atomic_ops_to_root - enable AtomicOp requests to root port
> > * @dev: the PCI device
> > - * @cap_mask: mask of desired AtomicOp sizes, including one or more of:
> > - * PCI_EXP_DEVCAP2_ATOMIC_COMP32
> > - * PCI_EXP_DEVCAP2_ATOMIC_COMP64
> > - * PCI_EXP_DEVCAP2_ATOMIC_COMP128
> > + * @cap_mask: root port must support combinations of AtomicOp sizes
> > + * PCI_EXP_ROOT_PORT_ATOMIC_BASE
> > + * PCI_EXP_ROOT_PORT_ATOMIC_FULL
> > *
> > * Return 0 if all upstream bridges support AtomicOp routing, egress
> > * blocking is disabled on all upstream ports, and the root port supports
> > - * the requested completion capabilities (32-bit, 64-bit and/or 128-bit
> > - * AtomicOp completion), or negative otherwise.
> > + * all the requested completion capabilities (BASE: 32-bit, 64-bit or
> > + * FULL: 32/64- and 128-bit AtomicOp completion). In that case enable the
> > + * device to send AtomicOp requests. Otherwise, return negative and leave
> > + * the enablement in the PCI config space untouched.
> > */
> > int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
> > {
> > diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
> > index 14f634ab9350d5442192162225b5e5202dbe2308..63ac62b882a94c6873a0db433ba808332ddbea04 100644
> > --- a/include/uapi/linux/pci_regs.h
> > +++ b/include/uapi/linux/pci_regs.h
> > @@ -669,6 +669,14 @@
> > #define PCI_EXP_DEVCAP2_ATOMIC_COMP32 0x00000080 /* 32b AtomicOp completion */
> > #define PCI_EXP_DEVCAP2_ATOMIC_COMP64 0x00000100 /* 64b AtomicOp completion */
> > #define PCI_EXP_DEVCAP2_ATOMIC_COMP128 0x00000200 /* 128b AtomicOp completion */
> > +/* PCIe spec 7.0 6.15.3.1: Root ports may support one of 2 sets of Atomic Ops */
> > +#define PCI_EXP_ROOT_PORT_ATOMIC_BASE \
> > + (PCI_EXP_DEVCAP2_ATOMIC_COMP32 | \
> > + PCI_EXP_DEVCAP2_ATOMIC_COMP64)
> > +#define PCI_EXP_ROOT_PORT_ATOMIC_FULL \
> > + (PCI_EXP_DEVCAP2_ATOMIC_COMP32 | \
> > + PCI_EXP_DEVCAP2_ATOMIC_COMP64 | \
> > + PCI_EXP_DEVCAP2_ATOMIC_COMP128)
>
> I'm sort of ambivalent about this patch, partly because it adds
> these #defines that aren't used anywhere. Also, the "BASE" and "FULL"
> names don't contain as much information as mentioning COMP32, COMP64,
> and COMP128 does.
Hi Bjorn,
I see your point. This patch is better suited to lead into a separate
small series that continues on to actually propose corrections of
today's (mis-)use of pci_enable_atomic_ops_to_root() in the
corresponding device drivers.
> If we *do* want this, I think these combo definitions are beyond the
> scope of uapi/linux/pci_regs.h, which generally is just
> transliteration of register bits from the spec. They could possibly
> go in linux/pci.h where pci_enable_atomic_ops_to_root() is declared.
I like this idea, I'll move the "valid combination" defines to
linux/pci.h
>
> > #define PCI_EXP_DEVCAP2_LTR 0x00000800 /* Latency tolerance reporting */
> > #define PCI_EXP_DEVCAP2_TPH_COMP_MASK 0x00003000 /* TPH completer support */
> > #define PCI_EXP_DEVCAP2_OBFF_MASK 0x000c0000 /* OBFF support mechanism */
> >
> > --
> > 2.51.0
> >
Thanks, Gerd
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v3 2/2] PCI: AtomicOps: Fix logic in enable function
2026-03-10 21:52 ` Bjorn Helgaas
@ 2026-03-11 12:19 ` Gerd Bayer
0 siblings, 0 replies; 7+ messages in thread
From: Gerd Bayer @ 2026-03-11 12:19 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Bjorn Helgaas, Jay Cornwall, Felix Kuehling, Leon Romanovsky,
Niklas Schnelle, Alexander Schmidt, linux-s390, linux-pci,
linux-kernel, netdev, linux-rdma, stable, Gerd Bayer
On Tue, 2026-03-10 at 16:52 -0500, Bjorn Helgaas wrote:
> On Fri, Mar 06, 2026 at 06:13:59PM +0100, Gerd Bayer wrote:
> > Move the check for root port requirements past the loop within
> > pci_enable_atomic_ops_to_root() that checks on potential switch
> > (up- and downstream) ports.
> >
> > Inside the loop traversing the PCI tree upwards, prepend the switch case
> > to validate the routing capability on any port with a fallthrough-case
> > that does the additional check for Atomic Ops not being blocked on
> > upstream ports.
>
> Thanks for looking at this. I think this makes good sense, and I'd
> like to:
>
> - Hoist the problem description up here. IIUC we enable AtomicOps on
> s390 when we shouldn't, which presumably leads to some problem. I
> think the same could happen anywhere we don't have a Root Port,
> e.g., jailhouse, loongarch, maybe some VMM guests?
A few things need to align here in order to observe the bug:
- architecture/configuration w/o Root Port knowledge
- PCIe device with AtomicOps support
- device driver that requests the AtomicOps enablement at the device
Unfortunately, I don't have access to any other combination that may
fail this way. However, I do have access to an x86 system to verify
that this does not generate an (immediate) regression.
> - Reduce or remove the text above, which is basically C code
> translated to English, and move it down after the problem
> description, so we can state the problem and symptom, followed by
> the solution.
Makes sense: I'll focus on the actual issue in the commit message here
and spin off a new series with patch 1.
> I think the core is (as you say below) that if there's no Root Port,
> we previously allowed endpoints to use AtomicOps even in cases where
> we don't know if the recipient supports them.
>
> That *sounds* bad, and if you actually saw some kind of corruption as
> a result, that would make this very compelling.
So far, we have not seen any real functional fall-out on s390 due to
this bug. Our current use-cases of Mellanox/Nvidia's ConnectX adapters
do not seem to lead to the adapter's exploitation of PCIe AtomicOps.
However driver init succeeds to enable AtomicOps Requests as can be
observed with lspci.
> > Do not enable Atomic Op Requests if nothing can be learned about how the
> > device is attached - e.g. if it is on an "isolated" bus, as in s390.
> >
> > Reported-by: Alexander Schmidt <alexs@linux.ibm.com>
>
> If there's any public report of the problem, include the URL here.
I can offer excerpts of output from `lspci`, only.
> > Cc: stable@vger.kernel.org
> > Fixes: 430a23689dea ("PCI: Add pci_enable_atomic_ops_to_root()")
> > Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> > ---
> > drivers/pci/pci.c | 30 ++++++++++++++----------------
> > 1 file changed, 14 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index cc8abe6b1d07661488895876dbbcf8aaeadf4a17..23db6ad5f310ed009a9b2ca4933c7498e0d22b85 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -3677,7 +3677,7 @@ void pci_acs_init(struct pci_dev *dev)
> > int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
> > {
> > struct pci_bus *bus = dev->bus;
> > - struct pci_dev *bridge;
> > + struct pci_dev *bridge = NULL;
> > u32 cap, ctl2;
> >
> > /*
> > @@ -3715,29 +3715,27 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
>
> Since we're looking at this, I think we should update the spec
> references in this function (in a separate patch).
>
> * Per PCIe r5.0, sec 9.3.5.10, the AtomicOp Requester Enable bit
> * in Device Control 2 is reserved in VFs and the PF value applies
> * to all associated VFs.
>
> It looks like the AtomicOp Requester Enable part of PCIe r5.0, sec
> 9.3.5.10, was incorporated into the Device Control 2 Register
> description in PCIe r7.0, sec 7.5.3.16.
>
> * Per PCIe r4.0, sec 6.15, endpoints and root ports may be
> * AtomicOp requesters. For now, we only support endpoints as
> * requesters and root ports as completers. No endpoints as
> * completers, and no peer-to-peer.
>
> This looks like PCIe r7.0, sec 6.15. Same section as r4.0, but we
> should at least make both of these refer to the same spec revision.
>
Fair request... Will clean up in a separate patch.
> > switch (pci_pcie_type(bridge)) {
> > /* Ensure switch ports support AtomicOp routing */
> > case PCI_EXP_TYPE_UPSTREAM:
> > - case PCI_EXP_TYPE_DOWNSTREAM:
> > - if (!(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
> > - return -EINVAL;
> > - break;
> > -
> > - /* Ensure root port supports all the sizes we care about */
> > - case PCI_EXP_TYPE_ROOT_PORT:
> > - if ((cap & cap_mask) != cap_mask)
> > - return -EINVAL;
> > - break;
> > - }
> > -
> > - /* Ensure upstream ports don't block AtomicOps on egress */
> > - if (pci_pcie_type(bridge) == PCI_EXP_TYPE_UPSTREAM) {
> > + /* Upstream ports must not block AtomicOps on egress */
> > pcie_capability_read_dword(bridge, PCI_EXP_DEVCTL2,
> > &ctl2);
> > if (ctl2 & PCI_EXP_DEVCTL2_ATOMIC_EGRESS_BLOCK)
> > return -EINVAL;
> > + fallthrough;
> > + /* All switch ports need to route AtomicOps */
> > + case PCI_EXP_TYPE_DOWNSTREAM:
> > + if (!(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
> > + return -EINVAL;
> > + break;
> > }
> > -
> > bus = bus->parent;
> > }
> >
> > + /* Finally, last bridge must be root port and support requested sizes */
> > + if ((!bridge) ||
> > + (pci_pcie_type(bridge) != PCI_EXP_TYPE_ROOT_PORT) ||
> > + ((cap & cap_mask) != cap_mask))
> > + return -EINVAL;
> > +
> > pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
> > PCI_EXP_DEVCTL2_ATOMIC_REQ);
> > return 0;
> >
> > --
> > 2.51.0
> >
Thank you,
Gerd
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2026-03-11 12:19 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-06 17:13 [PATCH v3 0/2] PCI: AtomicOps: Fix pci_enable_atomic_ops_to_root() Gerd Bayer
2026-03-06 17:13 ` [PATCH v3 1/2] PCI: AtomicOps: Define valid root port capabilities Gerd Bayer
2026-03-10 21:49 ` Bjorn Helgaas
2026-03-11 10:43 ` Gerd Bayer
2026-03-06 17:13 ` [PATCH v3 2/2] PCI: AtomicOps: Fix logic in enable function Gerd Bayer
2026-03-10 21:52 ` Bjorn Helgaas
2026-03-11 12:19 ` Gerd Bayer
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox