* [PATCH RFC 0/1] Add AtomicOp Requester support @ 2015-08-19 21:10 Jay Cornwall 2015-08-19 21:10 ` [PATCH RFC 1/1] PCI: Add pci_enable_atomic_request Jay Cornwall 2015-09-14 19:58 ` [PATCH RFC 0/1] Add AtomicOp Requester support Bjorn Helgaas 0 siblings, 2 replies; 8+ messages in thread From: Jay Cornwall @ 2015-08-19 21:10 UTC (permalink / raw) To: linux-pci; +Cc: Jay Cornwall The PCIe 3.0 AtomicOp (6.15) feature allows atomic transctions to be requested by, routed through and completed by PCIe components. Routing and completion do not require software support. Component support for each is detectable via the DEVCAP2 register. AtomicOp requests are permitted only if a component's DEVCTL2.ATOMICOP_REQUESTER_ENABLE field is set. This capability cannot be detected but is a no-op if set on a component with no support. These requests can only be serviced if the upstream components support AtomicOp completion and/or routing to a component which does. A concrete example is the AMD Fiji-class GPU, which is specified to support AtomicOp requests, routed through a PLX 8747 switch (advertising AtomicOp routing) to a Haswell host bridge (advertising AtomicOp completion support). When AtomicOp requests are disabled the GPU logs attempts to initiate requests to an MMIO register for debugging. Several approaches for software support might be considered: 1. drivers/pci sets DEVCTL2.ATOMICOP_REQUESTER_ENABLE unconditionally for all endpoints and root ports 2. drivers/pci attempts to establish a routable path to a completer prior to setting DEVCTL2.ATOMICOP_REQUESTER_ENABLE 3. Approach 1/2 with individual drivers (drm/amdgpu in the above example) initiating the request for AtomicOp requester support through a function exported from drivers/pci 4. Individual drivers specify a target component for AtomicOp completion Approach 1 has two drawbacks. There is no guarantee that there is a reachable component which can complete an AtomicOp request. It also prevents individual drivers from blacklisting devices with known incorrect implementations. This might otherwise provide useful diagnostics information. (e.g. AMD GPUs will log an error to an MMIO register if the AtomicOp requester is disabled when an atomic memory request would have been promoted to an AtomicOp.) Approach 2 could only establish that there is a path to at least one completer, but it would not prevent requests being sent to a different device which does not support AtomicOp completion. For example, a root complex might support completion but a transaction could be sent to a different device which does not. The routable guarantee is not precise and so less useful. Approach 3 allows drivers to enable AtomicOp requests on a per-device basis to support blacklisting. A downside is that if AtomicOp support becomes more prevalent it may be undesirable to explicitly enable the feature in individual drivers. Approach 4 is intractable as the target for a transaction is generally known only by the application. DEVCTL2.ATOMICOP_REQUESTER_ENABLE is also a 1:many capability and would not align well with this model. In the absence of an ideal solution, I think approach 3(1) is the most appropriate. I am open to suggestions for an improved implementation. Jay Cornwall (1): PCI: Add pci_enable_atomic_request drivers/pci/pci.c | 23 +++++++++++++++++++++++ include/linux/pci.h | 1 + include/uapi/linux/pci_regs.h | 1 + 3 files changed, 25 insertions(+) -- 1.9.1 ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH RFC 1/1] PCI: Add pci_enable_atomic_request 2015-08-19 21:10 [PATCH RFC 0/1] Add AtomicOp Requester support Jay Cornwall @ 2015-08-19 21:10 ` Jay Cornwall 2015-09-14 19:58 ` [PATCH RFC 0/1] Add AtomicOp Requester support Bjorn Helgaas 1 sibling, 0 replies; 8+ messages in thread From: Jay Cornwall @ 2015-08-19 21:10 UTC (permalink / raw) To: linux-pci; +Cc: Jay Cornwall Allow individual drivers to control AtomicOp Requester PCIe 3.0 capability. This is a no-op on devices which do not support AtomicOp requests. Signed-off-by: Jay Cornwall <jay@jcornwall.me> --- drivers/pci/pci.c | 23 +++++++++++++++++++++++ include/linux/pci.h | 1 + include/uapi/linux/pci_regs.h | 1 + 3 files changed, 25 insertions(+) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 0008c95..1b9d1a9 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -2410,6 +2410,29 @@ bool pci_acs_enabled(struct pci_dev *pdev, u16 acs_flags) } /** + * pci_enable_atomic_request - enable or disable AtomicOp requester + * @dev: the PCI device + */ +void pci_enable_atomic_request(struct pci_dev *dev) +{ + if (!pci_is_pcie(dev)) + return; + + switch (pci_pcie_type(dev)) { + /* PCIe 3.0, 6.15 specifies that endpoints and root ports are permitted + * to implement AtomicOp requester capabilities. */ + case PCI_EXP_TYPE_ENDPOINT: + case PCI_EXP_TYPE_LEG_END: + case PCI_EXP_TYPE_RC_END: + case PCI_EXP_TYPE_ROOT_PORT: + pcie_capability_set_word(dev, PCI_EXP_DEVCTL2, + PCI_EXP_DEVCTL2_ATOMICOP_REQ); + break; + } +} +EXPORT_SYMBOL(pci_enable_atomic_request); + +/** * pci_acs_path_enable - test ACS flags from start to end in a hierarchy * @start: starting downstream device * @end: ending upstream device or NULL to search to the root bus diff --git a/include/linux/pci.h b/include/linux/pci.h index 8a0321a..946c3ce 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -1750,6 +1750,7 @@ void pci_request_acs(void); bool pci_acs_enabled(struct pci_dev *pdev, u16 acs_flags); bool pci_acs_path_enabled(struct pci_dev *start, struct pci_dev *end, u16 acs_flags); +void pci_enable_atomic_request(struct pci_dev *dev); #define PCI_VPD_LRDT 0x80 /* Large Resource Data Type */ #define PCI_VPD_LRDT_ID(x) ((x) | PCI_VPD_LRDT) diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h index 413417f..d2d2e8d 100644 --- a/include/uapi/linux/pci_regs.h +++ b/include/uapi/linux/pci_regs.h @@ -578,6 +578,7 @@ #define PCI_EXP_DEVCTL2 40 /* Device Control 2 */ #define PCI_EXP_DEVCTL2_COMP_TIMEOUT 0x000f /* Completion Timeout Value */ #define PCI_EXP_DEVCTL2_ARI 0x0020 /* Alternative Routing-ID */ +#define PCI_EXP_DEVCTL2_ATOMICOP_REQ 0x0040 /* Allow AtomicOp Requests */ #define PCI_EXP_DEVCTL2_IDO_REQ_EN 0x0100 /* Allow IDO for requests */ #define PCI_EXP_DEVCTL2_IDO_CMP_EN 0x0200 /* Allow IDO for completions */ #define PCI_EXP_DEVCTL2_LTR_EN 0x0400 /* Enable LTR mechanism */ -- 1.9.1 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH RFC 0/1] Add AtomicOp Requester support 2015-08-19 21:10 [PATCH RFC 0/1] Add AtomicOp Requester support Jay Cornwall 2015-08-19 21:10 ` [PATCH RFC 1/1] PCI: Add pci_enable_atomic_request Jay Cornwall @ 2015-09-14 19:58 ` Bjorn Helgaas 2015-09-21 20:44 ` Jay Cornwall 1 sibling, 1 reply; 8+ messages in thread From: Bjorn Helgaas @ 2015-09-14 19:58 UTC (permalink / raw) To: Jay Cornwall; +Cc: linux-pci Hi Jay, The [0/n] overview text doesn't get included in the changelog, but some of this description is very useful background. So I'd suggest incorporating at least part of it into the patch changelog. On Wed, Aug 19, 2015 at 04:10:01PM -0500, Jay Cornwall wrote: > The PCIe 3.0 AtomicOp (6.15) feature allows atomic transctions to be requested > by, routed through and completed by PCIe components. Routing and completion > do not require software support. Component support for each is detectable via > the DEVCAP2 register. > > AtomicOp requests are permitted only if a component's > DEVCTL2.ATOMICOP_REQUESTER_ENABLE field is set. This capability cannot be > detected but is a no-op if set on a component with no support. These requests > can only be serviced if the upstream components support AtomicOp completion > and/or routing to a component which does. > > A concrete example is the AMD Fiji-class GPU, which is specified to support > AtomicOp requests, routed through a PLX 8747 switch (advertising AtomicOp > routing) to a Haswell host bridge (advertising AtomicOp completion support). > When AtomicOp requests are disabled the GPU logs attempts to initiate requests > to an MMIO register for debugging. > > Several approaches for software support might be considered: > > 1. drivers/pci sets DEVCTL2.ATOMICOP_REQUESTER_ENABLE unconditionally for > all endpoints and root ports > > 2. drivers/pci attempts to establish a routable path to a completer prior to > setting DEVCTL2.ATOMICOP_REQUESTER_ENABLE > > 3. Approach 1/2 with individual drivers (drm/amdgpu in the above example) > initiating the request for AtomicOp requester support through a function > exported from drivers/pci > > 4. Individual drivers specify a target component for AtomicOp completion > > Approach 1 has two drawbacks. There is no guarantee that there is a reachable > component which can complete an AtomicOp request. It also prevents individual > drivers from blacklisting devices with known incorrect implementations. This > might otherwise provide useful diagnostics information. (e.g. AMD GPUs will > log an error to an MMIO register if the AtomicOp requester is disabled when an > atomic memory request would have been promoted to an AtomicOp.) > > Approach 2 could only establish that there is a path to at least one completer, > but it would not prevent requests being sent to a different device which does > not support AtomicOp completion. For example, a root complex might support > completion but a transaction could be sent to a different device which does > not. The routable guarantee is not precise and so less useful. I assume the common usage scenario is to enable AtomicOps for host-to-device and/or device-to-host transactions, and we can ignore device-to-device transactions for now. If I understand correctly, AtomicOps must be supported by all devices along the path, e.g., a Root Port, possibly some Switch Ports, and finally an Endpoint. I guess your worry with Approach 2 is for a scenario like this: 00:1c.0: PCI bridge to [bus 01-04] Root Port, with AtomicOp Routing 01:00.0: PCI bridge to [bus 02-04] Upstream Port, with AtomicOp Routing 02:00.0: PCI bridge to [bus 03] Downstream Port, with AtomicOp Routing 03:00.0: endpoint AtomicOp Completer Supported 02:00.1: PCI bridge to [bus 04] Downstream Port, with AtomicOp Routing 04:00.0: endpoint no AtomicOp Completer support It's true that we wouldn't want to enable AtomicOp routing to 04:00.0, but isn't that what the AtomicOp Egress Blocking bit is for? If we set that in 02:00.1, we should be safe in the sense that AtomicOps targeting 04:00.0 should cause non-fatal errors. > Approach 3 allows drivers to enable AtomicOp requests on a per-device basis to > support blacklisting. A downside is that if AtomicOp support becomes more > prevalent it may be undesirable to explicitly enable the feature in individual > drivers. Your pci_enable_atomic_request() enables AtomicOps for one component. I assume that means the driver would have to map out the topology, figure out whether all the components support AtomicOp routing, and call pci_enable_atomic_request() for the Root Port and the Endpoint. That seems like a lot of grubbing around for a driver. I think a driver should only call pci_enable_atomic_request() for its own device, and the PCI core should figure out whether it can be enabled, and if it can, do everything needed to enable it. > Approach 4 is intractable as the target for a transaction is generally known > only by the application. DEVCTL2.ATOMICOP_REQUESTER_ENABLE is also a 1:many > capability and would not align well with this model. > > In the absence of an ideal solution, I think approach 3(1) is the most > appropriate. I am open to suggestions for an improved implementation. > > Jay Cornwall (1): > PCI: Add pci_enable_atomic_request > > drivers/pci/pci.c | 23 +++++++++++++++++++++++ > include/linux/pci.h | 1 + > include/uapi/linux/pci_regs.h | 1 + > 3 files changed, 25 insertions(+) > > -- > 1.9.1 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH RFC 0/1] Add AtomicOp Requester support 2015-09-14 19:58 ` [PATCH RFC 0/1] Add AtomicOp Requester support Bjorn Helgaas @ 2015-09-21 20:44 ` Jay Cornwall 2015-09-21 22:46 ` Bjorn Helgaas 0 siblings, 1 reply; 8+ messages in thread From: Jay Cornwall @ 2015-09-21 20:44 UTC (permalink / raw) To: Bjorn Helgaas; +Cc: linux-pci On 2015-09-14 14:58, Bjorn Helgaas wrote: > On Wed, Aug 19, 2015 at 04:10:01PM -0500, Jay Cornwall wrote: >> Approach 2 could only establish that there is a path to at least one >> completer, >> but it would not prevent requests being sent to a different device >> which does >> not support AtomicOp completion. For example, a root complex might >> support >> completion but a transaction could be sent to a different device which >> does >> not. The routable guarantee is not precise and so less useful. > I assume the common usage scenario is to enable AtomicOps for > host-to-device and/or device-to-host transactions, and we can ignore > device-to-device transactions for now. > > If I understand correctly, AtomicOps must be supported by all devices > along the path, e.g., a Root Port, possibly some Switch Ports, and > finally an Endpoint. I guess your worry with Approach 2 is for a > scenario like this: > > 00:1c.0: PCI bridge to [bus 01-04] Root Port, with AtomicOp Routing > 01:00.0: PCI bridge to [bus 02-04] Upstream Port, with AtomicOp Routing > 02:00.0: PCI bridge to [bus 03] Downstream Port, with AtomicOp Routing > 03:00.0: endpoint AtomicOp Completer Supported > 02:00.1: PCI bridge to [bus 04] Downstream Port, with AtomicOp Routing > 04:00.0: endpoint no AtomicOp Completer support > > It's true that we wouldn't want to enable AtomicOp routing to 04:00.0, > but isn't that what the AtomicOp Egress Blocking bit is for? If we > set that in 02:00.1, we should be safe in the sense that AtomicOps > targeting 04:00.0 should cause non-fatal errors. If 02:00.1 had egress blocking then, if I understand correctly, a 00:1c.0 -> 04:00.0 AtomicOp request would be blocked. Host-to-device and device-to-device look quite similar from this perspective. > Your pci_enable_atomic_request() enables AtomicOps for one component. > I assume that means the driver would have to map out the topology, > figure out whether all the components support AtomicOp routing, and > call pci_enable_atomic_request() for the Root Port and the Endpoint. > That seems like a lot of grubbing around for a driver. > > I think a driver should only call pci_enable_atomic_request() for its > own device, and the PCI core should figure out whether it can be > enabled, and if it can, do everything needed to enable it. OK. I've put together a v2 patch which checks the upstream bridges for AtomicOp routing and the root port for completion. I've left out egress blocking per the above concern. -- Jay Cornwall ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH RFC 0/1] Add AtomicOp Requester support 2015-09-21 20:44 ` Jay Cornwall @ 2015-09-21 22:46 ` Bjorn Helgaas 2015-09-22 0:16 ` Jay Cornwall 0 siblings, 1 reply; 8+ messages in thread From: Bjorn Helgaas @ 2015-09-21 22:46 UTC (permalink / raw) To: Jay Cornwall; +Cc: linux-pci On Mon, Sep 21, 2015 at 03:44:59PM -0500, Jay Cornwall wrote: > On 2015-09-14 14:58, Bjorn Helgaas wrote: > > >On Wed, Aug 19, 2015 at 04:10:01PM -0500, Jay Cornwall wrote: > > >>Approach 2 could only establish that there is a path to at least > >>one completer, > >>but it would not prevent requests being sent to a different > >>device which does > >>not support AtomicOp completion. For example, a root complex > >>might support > >>completion but a transaction could be sent to a different device > >>which does > >>not. The routable guarantee is not precise and so less useful. > > >I assume the common usage scenario is to enable AtomicOps for > >host-to-device and/or device-to-host transactions, and we can ignore > >device-to-device transactions for now. > > > >If I understand correctly, AtomicOps must be supported by all devices > >along the path, e.g., a Root Port, possibly some Switch Ports, and > >finally an Endpoint. I guess your worry with Approach 2 is for a > >scenario like this: > > > >00:1c.0: PCI bridge to [bus 01-04] Root Port, with AtomicOp Routing > >01:00.0: PCI bridge to [bus 02-04] Upstream Port, with AtomicOp Routing > >02:00.0: PCI bridge to [bus 03] Downstream Port, with AtomicOp Routing > >03:00.0: endpoint AtomicOp Completer Supported > >02:00.1: PCI bridge to [bus 04] Downstream Port, with AtomicOp Routing > >04:00.0: endpoint no AtomicOp Completer support > > > >It's true that we wouldn't want to enable AtomicOp routing to 04:00.0, > >but isn't that what the AtomicOp Egress Blocking bit is for? If we > >set that in 02:00.1, we should be safe in the sense that AtomicOps > >targeting 04:00.0 should cause non-fatal errors. > > If 02:00.1 had egress blocking then, if I understand correctly, a > 00:1c.0 -> 04:00.0 AtomicOp request would be blocked. Yes, a 1c.0 -> 04:00.0 AtomicOp request would be blocked, but 04:00.0 doesn't support AtomicOps, so we *want* that request to be blocked, don't we? If 04:00.0 received an AtomicOp, I think it would handle it as a Malformed TLP, which by default is a Fatal Error. If we set AtomicOpEgress Blocking in 02:00.1 and attempt a 1c.0 -> 04:00.0 AtomicOp request, my reading is that 02:00.1 should report an AtomicOp Egress Blocked error, which by default is an Advisory Non-Fatal Error, and 04:00.0 should never receive the AtomicOp. This is from the second-to-last paragraph of PCIe spec r3.0, sec 6.15. Even if we set AtomicOpEgress Blocking in 02:00.1, an AtomicOp to 03:00.0 should work, because that would be routed via 02:00.0, not 02:00.1. Bjorn ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH RFC 0/1] Add AtomicOp Requester support 2015-09-21 22:46 ` Bjorn Helgaas @ 2015-09-22 0:16 ` Jay Cornwall 2015-09-22 16:22 ` Bjorn Helgaas 0 siblings, 1 reply; 8+ messages in thread From: Jay Cornwall @ 2015-09-22 0:16 UTC (permalink / raw) To: Bjorn Helgaas; +Cc: linux-pci On 2015-09-21 17:46, Bjorn Helgaas wrote: > On Mon, Sep 21, 2015 at 03:44:59PM -0500, Jay Cornwall wrote: >> On 2015-09-14 14:58, Bjorn Helgaas wrote: >> >> >00:1c.0: PCI bridge to [bus 01-04] Root Port, with AtomicOp Routing >> >01:00.0: PCI bridge to [bus 02-04] Upstream Port, with AtomicOp Routing >> >02:00.0: PCI bridge to [bus 03] Downstream Port, with AtomicOp Routing >> >03:00.0: endpoint AtomicOp Completer Supported >> >02:00.1: PCI bridge to [bus 04] Downstream Port, with AtomicOp Routing >> >04:00.0: endpoint no AtomicOp Completer support >> > >> >It's true that we wouldn't want to enable AtomicOp routing to 04:00.0, >> >but isn't that what the AtomicOp Egress Blocking bit is for? If we >> >set that in 02:00.1, we should be safe in the sense that AtomicOps >> >targeting 04:00.0 should cause non-fatal errors. >> >> If 02:00.1 had egress blocking then, if I understand correctly, a >> 00:1c.0 -> 04:00.0 AtomicOp request would be blocked. > > Yes, a 1c.0 -> 04:00.0 AtomicOp request would be blocked, but 04:00.0 > doesn't support AtomicOps, so we *want* that request to be blocked, > don't we? If 04:00.0 received an AtomicOp, I think it would handle it > as a Malformed TLP, which by default is a Fatal Error. I think I was confused by your earlier comment: >> >I assume the common usage scenario is to enable AtomicOps for >> >host-to-device and/or device-to-host transactions, and we can ignore >> >device-to-device transactions for now. 00:1c.0 -> 04:00.0 would be the host-to-device scenario. It's true that 04:00.0 does not support AtomicOp completion in the above example. However, enabling AtomicOp requests for 04:00.0 would cause egress blocks to be set in the 00:1c.0 -> 03:00.0 path. Our immediate use case for AtomicOps is indeed device-to-host, and I expect this is the most common case. I can make a V3 patch which explicitly supports this case only by setting egress blocks on downstream ports of upstream bridges. This would guarantee that AtomicOp requests terminate correctly, at the expense of potential users of host-to-device or device-to-device scenarios. -- Jay Cornwall ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH RFC 0/1] Add AtomicOp Requester support 2015-09-22 0:16 ` Jay Cornwall @ 2015-09-22 16:22 ` Bjorn Helgaas 2015-09-23 16:44 ` Jay Cornwall 0 siblings, 1 reply; 8+ messages in thread From: Bjorn Helgaas @ 2015-09-22 16:22 UTC (permalink / raw) To: Jay Cornwall; +Cc: linux-pci On Mon, Sep 21, 2015 at 07:16:08PM -0500, Jay Cornwall wrote: > On 2015-09-21 17:46, Bjorn Helgaas wrote: > >On Mon, Sep 21, 2015 at 03:44:59PM -0500, Jay Cornwall wrote: > >>On 2015-09-14 14:58, Bjorn Helgaas wrote: > >> > >>>00:1c.0: PCI bridge to [bus 01-04] Root Port, with AtomicOp Routing > >>>01:00.0: PCI bridge to [bus 02-04] Upstream Port, with AtomicOp Routing > >>>02:00.0: PCI bridge to [bus 03] Downstream Port, with AtomicOp Routing > >>>03:00.0: endpoint AtomicOp Completer Supported > >>>02:00.1: PCI bridge to [bus 04] Downstream Port, with AtomicOp Routing > >>>04:00.0: endpoint no AtomicOp Completer support > >>> > >>>It's true that we wouldn't want to enable AtomicOp routing to 04:00.0, > >>>but isn't that what the AtomicOp Egress Blocking bit is for? If we > >>>set that in 02:00.1, we should be safe in the sense that AtomicOps > >>>targeting 04:00.0 should cause non-fatal errors. > >> > >>If 02:00.1 had egress blocking then, if I understand correctly, a > >>00:1c.0 -> 04:00.0 AtomicOp request would be blocked. > > > >Yes, a 1c.0 -> 04:00.0 AtomicOp request would be blocked, but 04:00.0 > >doesn't support AtomicOps, so we *want* that request to be blocked, > >don't we? If 04:00.0 received an AtomicOp, I think it would handle it > >as a Malformed TLP, which by default is a Fatal Error. > > I think I was confused by your earlier comment: > > >>>I assume the common usage scenario is to enable AtomicOps for > >>>host-to-device and/or device-to-host transactions, and we can ignore > >>>device-to-device transactions for now. > > 00:1c.0 -> 04:00.0 would be the host-to-device scenario. It's true > that 04:00.0 does not support AtomicOp completion in the above > example. However, enabling AtomicOp requests for 04:00.0 would cause > egress blocks to be set in the 00:1c.0 -> 03:00.0 path. In this particular example, enabling AtomicOp requests for 04:00.0 should fail and do nothing, because 04:00.0 doesn't support AtomicOps. But you're right that if pci_enable_atomic_request() turns on egress blocking bits, and both 03:00.0 and 04:00.0 support AtomicOps, enabling AtomicOps for one device would block them for the other. I'm not sure how we should use egress blocking. I could imagine setting egress blocking bits for the whole tree at enumeration-time, based on analysis of which root ports and functions have "AtomicOp Complete Supported" bits set. I could also imagine setting egress blocking bits at the time a driver calls pci_enable_atomic_request(), but I don't really like the way this could affect unrelated devices. For example, Case A - Enumerate devices, leaving egress blocking bits cleared - Send 1c.0 -> 04:00.0 AtomicOp - 04:00.0 raises Malformed TLP Fatal Error Case B - Enumerate devices, leaving egress blocking bits cleared - 03:00.0 driver calls pci_enable_atomic_request() - Set 02:00.01 egress blocking bit - Send 1c.0 -> 04:00.0 AtomicOp - 02:00.01 raises AtomicOp Egress Blocked Advisory Non-Fatal Error It seems wrong that the same action (AtomicOp to 04:00.0) causes different errors, depending on an unrelated device. We don't have any guarantee that AtomicOps terminate correctly today, so maybe we don't need to add one right now. If we *do* decide to add one, it seems like it would be better done at enumeration-time. So I think your v2 patch seems like pretty much what we want, at least when we want to set up device-to-host AtomicOps: it validates that the the fabric from device to the root port supports AtomicOp routing, validates that the root port supports AtomicOp completion, and enables AtomicOp requests from the function. The host-to-device case seems a little different, though. In that case, I think we need to turn on AtomicOp Requester Enable in the root port. I think it'd be nicer if the driver didn't have to look up the root port device, and there might be other issues, too. If you don't have a need for this case yet, maybe we should leave it unsupported for now. Bjorn ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH RFC 0/1] Add AtomicOp Requester support 2015-09-22 16:22 ` Bjorn Helgaas @ 2015-09-23 16:44 ` Jay Cornwall 0 siblings, 0 replies; 8+ messages in thread From: Jay Cornwall @ 2015-09-23 16:44 UTC (permalink / raw) To: Bjorn Helgaas; +Cc: linux-pci On 2015-09-22 11:22, Bjorn Helgaas wrote: > On Mon, Sep 21, 2015 at 07:16:08PM -0500, Jay Cornwall wrote: >> On 2015-09-21 17:46, Bjorn Helgaas wrote: >> >On Mon, Sep 21, 2015 at 03:44:59PM -0500, Jay Cornwall wrote: >> >>On 2015-09-14 14:58, Bjorn Helgaas wrote: >> >> >> >>>00:1c.0: PCI bridge to [bus 01-04] Root Port, with AtomicOp Routing >> >>>01:00.0: PCI bridge to [bus 02-04] Upstream Port, with AtomicOp Routing >> >>>02:00.0: PCI bridge to [bus 03] Downstream Port, with AtomicOp Routing >> >>>03:00.0: endpoint AtomicOp Completer Supported >> >>>02:00.1: PCI bridge to [bus 04] Downstream Port, with AtomicOp Routing >> >>>04:00.0: endpoint no AtomicOp Completer support >> >>> >> I think I was confused by your earlier comment: >> >> >>>I assume the common usage scenario is to enable AtomicOps for >> >>>host-to-device and/or device-to-host transactions, and we can ignore >> >>>device-to-device transactions for now. >> >> 00:1c.0 -> 04:00.0 would be the host-to-device scenario. It's true >> that 04:00.0 does not support AtomicOp completion in the above >> example. However, enabling AtomicOp requests for 04:00.0 would cause >> egress blocks to be set in the 00:1c.0 -> 03:00.0 path. > > In this particular example, enabling AtomicOp requests for 04:00.0 > should fail and do nothing, because 04:00.0 doesn't support AtomicOps. > > But you're right that if pci_enable_atomic_request() turns on egress > blocking bits, and both 03:00.0 and 04:00.0 support AtomicOps, > enabling AtomicOps for one device would block them for the other. > > I'm not sure how we should use egress blocking. I could imagine > setting egress blocking bits for the whole tree at enumeration-time, > based on analysis of which root ports and functions have "AtomicOp > Complete Supported" bits set. I think this was my central concern. If we were to do this analysis at enumeration-time then a similar argument might be put forward for setting AtomicOp Requester Enable here as well. This would make pci_enable_atomic_request redundant. Thinking more, though, this case is different because AtomicOp Requester capability cannot be detected. Having a per-device call for drivers with knowledge of this capability would seem reasonable. The alternative would be to set DEVCTL2.ATOMICOP_REQUESTER_ENABLE for all devices at enumeration time. > We don't have any guarantee that AtomicOps terminate correctly today, > so maybe we don't need to add one right now. If we *do* decide to add > one, it seems like it would be better done at enumeration-time. > > So I think your v2 patch seems like pretty much what we want, at least > when we want to set up device-to-host AtomicOps: it validates that the > the fabric from device to the root port supports AtomicOp routing, > validates that the root port supports AtomicOp completion, and enables > AtomicOp requests from the function. > > The host-to-device case seems a little different, though. In that > case, I think we need to turn on AtomicOp Requester Enable in the root > port. I think it'd be nicer if the driver didn't have to look up the > root port device, and there might be other issues, too. If you don't > have a need for this case yet, maybe we should leave it unsupported > for now. Yes, that case is somewhat awkward. Our (AMD's) use case is to allow the CPU and GPU to synchronize access to shared data structures, in a uniform way with our integrated parts. These are located in system memory (so, device->host atomics). I can't think of a good reason to reverse this scenario. (Our GPUs don't support AtomicOp Completion in any case.) I'll address your comments and submit a V3 with no design changes. -- Jay Cornwall ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2015-09-23 16:47 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-08-19 21:10 [PATCH RFC 0/1] Add AtomicOp Requester support Jay Cornwall 2015-08-19 21:10 ` [PATCH RFC 1/1] PCI: Add pci_enable_atomic_request Jay Cornwall 2015-09-14 19:58 ` [PATCH RFC 0/1] Add AtomicOp Requester support Bjorn Helgaas 2015-09-21 20:44 ` Jay Cornwall 2015-09-21 22:46 ` Bjorn Helgaas 2015-09-22 0:16 ` Jay Cornwall 2015-09-22 16:22 ` Bjorn Helgaas 2015-09-23 16:44 ` Jay Cornwall
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).