* PCIe resets/restore and lack of CRS wait @ 2018-03-22 4:03 Benjamin Herrenschmidt 2018-03-22 4:36 ` okaya 0 siblings, 1 reply; 10+ messages in thread From: Benjamin Herrenschmidt @ 2018-03-22 4:03 UTC (permalink / raw) To: linux-pci, Bjorn Helgaas; +Cc: Michael Neuling Hi Folks ! So while chasing some issues in our EEH error handling, we noticed that the generic code has about a bazillion of "reset" path for devices, most of them seemingly missing a wait for CRS after the reset. That includes PM based resets or wakeups (can a D3->D0 transition cause CRS to be returned ? Unclear but we should try to be safe), but mostly it includes anything that resets the pcie port (PERST) or the secondary bridge reset (hot resets). For example take __pci_reset_function_locked(...), it can call pci_parent_bus_reset() which will perform a hot reset but will *not* wait for CRS. There are a plethora of reset path in there that are similar, it's actually hard to figure out which is what, but they all have in common that they don't wait for CRS with the notable exception of the FLR case. I'm keen on doing a rather "blanket" fix by adding a CRS wait inside pci_dev_restore(). Would you guys agree ? Also why does pci_flr_wait() not use vendor/device ID but instead waits on the COMMAND register being all 1's ? It's not clear to me ... VID/DID will give a very specific signature for CRS which is ffff0001 while COMMAND could return all 1's for other reasons (device unplugged for example). Cheers, Ben. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PCIe resets/restore and lack of CRS wait 2018-03-22 4:03 PCIe resets/restore and lack of CRS wait Benjamin Herrenschmidt @ 2018-03-22 4:36 ` okaya 2018-03-22 5:12 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 10+ messages in thread From: okaya @ 2018-03-22 4:36 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: linux-pci, Bjorn Helgaas, Michael Neuling, linux-pci-owner On 2018-03-22 00:03, Benjamin Herrenschmidt wrote: > Hi Folks ! > > So while chasing some issues in our EEH error handling, we noticed that > the generic code has about a bazillion of "reset" path for devices, > most of them seemingly missing a wait for CRS after the reset. > > That includes PM based resets or wakeups (can a D3->D0 transition cause > CRS to be returned ? Unclear but we should try to be safe), but mostly > it includes anything that resets the pcie port (PERST) or the secondary > bridge reset (hot resets). > > For example take __pci_reset_function_locked(...), it can call > pci_parent_bus_reset() which will perform a hot reset but will *not* > wait for CRS. > > There are a plethora of reset path in there that are similar, it's > actually hard to figure out which is what, but they all have in common > that they don't wait for CRS with the notable exception of the FLR > case. Bjorn merged my change for 4.17. Kernel should handle these now. > > I'm keen on doing a rather "blanket" fix by adding a CRS wait inside > pci_dev_restore(). Would you guys agree ? > > Also why does pci_flr_wait() not use vendor/device ID but instead waits > on the COMMAND register being all 1's ? It's not clear to me ... > VID/DID will give a very specific signature for CRS which is ffff0001 > while COMMAND could return all 1's for other reasons (device unplugged > for example). > Because if you read vendor id of a virtual function, you get 0xffffffff > Cheers, > Ben. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PCIe resets/restore and lack of CRS wait 2018-03-22 4:36 ` okaya @ 2018-03-22 5:12 ` Benjamin Herrenschmidt 2018-03-22 6:24 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 10+ messages in thread From: Benjamin Herrenschmidt @ 2018-03-22 5:12 UTC (permalink / raw) To: okaya; +Cc: linux-pci, Bjorn Helgaas, Michael Neuling, linux-pci-owner On Thu, 2018-03-22 at 00:36 -0400, okaya@codeaurora.org wrote: > On 2018-03-22 00:03, Benjamin Herrenschmidt wrote: > > Hi Folks ! > > > > So while chasing some issues in our EEH error handling, we noticed that > > the generic code has about a bazillion of "reset" path for devices, > > most of them seemingly missing a wait for CRS after the reset. > > > > That includes PM based resets or wakeups (can a D3->D0 transition cause > > CRS to be returned ? Unclear but we should try to be safe), but mostly > > it includes anything that resets the pcie port (PERST) or the secondary > > bridge reset (hot resets). > > > > For example take __pci_reset_function_locked(...), it can call > > pci_parent_bus_reset() which will perform a hot reset but will *not* > > wait for CRS. > > > > There are a plethora of reset path in there that are similar, it's > > actually hard to figure out which is what, but they all have in common > > that they don't wait for CRS with the notable exception of the FLR > > case. > > Bjorn merged my change for 4.17. Kernel should handle these now. Ah nice ! I'll check that out, I was checking 4.16-rc6 ! > > I'm keen on doing a rather "blanket" fix by adding a CRS wait inside > > pci_dev_restore(). Would you guys agree ? > > > > Also why does pci_flr_wait() not use vendor/device ID but instead waits > > on the COMMAND register being all 1's ? It's not clear to me ... > > VID/DID will give a very specific signature for CRS which is ffff0001 > > while COMMAND could return all 1's for other reasons (device unplugged > > for example). > > > > Because if you read vendor id of a virtual function, you get 0xffffffff Ah indeed, I forgot about that... Thanks ! Cheers, Ben. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PCIe resets/restore and lack of CRS wait 2018-03-22 5:12 ` Benjamin Herrenschmidt @ 2018-03-22 6:24 ` Benjamin Herrenschmidt 2018-03-22 11:25 ` okaya 0 siblings, 1 reply; 10+ messages in thread From: Benjamin Herrenschmidt @ 2018-03-22 6:24 UTC (permalink / raw) To: okaya; +Cc: linux-pci, Bjorn Helgaas, Michael Neuling, linux-pci-owner On Thu, 2018-03-22 at 16:12 +1100, Benjamin Herrenschmidt wrote: > > > > I'm keen on doing a rather "blanket" fix by adding a CRS wait inside > > > pci_dev_restore(). Would you guys agree ? > > > > > > Also why does pci_flr_wait() not use vendor/device ID but instead waits > > > on the COMMAND register being all 1's ? It's not clear to me ... > > > VID/DID will give a very specific signature for CRS which is ffff0001 > > > while COMMAND could return all 1's for other reasons (device unplugged > > > for example). > > > > > > > Because if you read vendor id of a virtual function, you get 0xffffffff > > Ah indeed, I forgot about that... Actually, that makes me a bit nervous, I wonder if we should limit that "trick" to VFs and otherwise do the right thing. As per PCIe spec 3.1a (I haven't looked at 4.0 yet) << stalled while the device completes its self-initialization. Software that intends to take advantage of this mechanism must ensure that the first access made to a device following a valid reset condition is a Configuration Read Request accessing both bytes of the Vendor ID field in the device’s Configuration Space header. For this case only, the Root Complex, if enabled, will synthesize a special read-data value for the Vendor ID field to indicate to software that CRS Completion Status has been returned by the device. For other Configuration Requests, or when CRS Software Visibility is not enabled, the Root Complex will generally re-issue the Configuration Request until it completes with a status other than CRS as described in Section 2.3.2. >> That tells me that there is no guarantee by spec that we'll get ffff's, instead we might get HW stalls, or other really nasty effects when probing a register other than 0 (VID/DID) for CRS. Cheers, Ben. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PCIe resets/restore and lack of CRS wait 2018-03-22 6:24 ` Benjamin Herrenschmidt @ 2018-03-22 11:25 ` okaya 2018-03-22 13:46 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 10+ messages in thread From: okaya @ 2018-03-22 11:25 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: linux-pci, Bjorn Helgaas, Michael Neuling, linux-pci-owner On 2018-03-22 02:24, Benjamin Herrenschmidt wrote: > On Thu, 2018-03-22 at 16:12 +1100, Benjamin Herrenschmidt wrote: >> >> > > I'm keen on doing a rather "blanket" fix by adding a CRS wait inside >> > > pci_dev_restore(). Would you guys agree ? >> > > >> > > Also why does pci_flr_wait() not use vendor/device ID but instead waits >> > > on the COMMAND register being all 1's ? It's not clear to me ... >> > > VID/DID will give a very specific signature for CRS which is ffff0001 >> > > while COMMAND could return all 1's for other reasons (device unplugged >> > > for example). >> > > >> > >> > Because if you read vendor id of a virtual function, you get 0xffffffff >> >> Ah indeed, I forgot about that... > > Actually, that makes me a bit nervous, I wonder if we should limit > that "trick" to VFs and otherwise do the right thing. As per PCIe > spec 3.1a (I haven't looked at 4.0 yet) > > << > stalled while the device completes its self-initialization. Software > that intends to take advantage of > this mechanism must ensure that the first access made to a device > following a valid reset condition is > a Configuration Read Request accessing both bytes of the Vendor ID > field in the device’s > Configuration Space header. For this case only, the Root Complex, if > enabled, will synthesize a > special read-data value for the Vendor ID field to indicate to software > that CRS Completion Status > has been returned by the device. For other Configuration Requests, or > when CRS Software > Visibility is not enabled, the Root Complex will generally re-issue the > Configuration Request until it > completes with a status other than CRS as described in Section 2.3.2. >>> > > That tells me that there is no guarantee by spec that we'll get > ffff's, instead we might get HW stalls, or other really nasty > effects when probing a register other than 0 (VID/DID) for CRS. AFAIK, spec also mentions that sw needs to observe 0xffffffff for all other registers other than vendor id during CRS period. > > Cheers, > Ben. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PCIe resets/restore and lack of CRS wait 2018-03-22 11:25 ` okaya @ 2018-03-22 13:46 ` Benjamin Herrenschmidt 2018-03-22 13:58 ` Sinan Kaya 0 siblings, 1 reply; 10+ messages in thread From: Benjamin Herrenschmidt @ 2018-03-22 13:46 UTC (permalink / raw) To: okaya; +Cc: linux-pci, Bjorn Helgaas, Michael Neuling, linux-pci-owner On Thu, 2018-03-22 at 07:25 -0400, okaya@codeaurora.org wrote: > > > > > > > > That tells me that there is no guarantee by spec that we'll get > > ffff's, instead we might get HW stalls, or other really nasty > > effects when probing a register other than 0 (VID/DID) for CRS. > > AFAIK, spec also mentions that sw needs to observe 0xffffffff for all > other registers other than vendor id during CRS period. This isnt what's in the 3.1a spec at least ... section 2.3.2 explains the specified behaviour which is, for any register other than 0 (VID/DID), to re-issue the request... Now, it can have a timeout, and thus might be completed as a failed transaction after a while, but it's a sub-optimal way (ie, we'll end up hogging the CPU on loads) and it's not 100% clear that a failed transaction returns as all 1's (it should but ...). I can make sure it happens the way the code expects on powerpc, I'm not too worried about that, but I think for such a generic function, it would make sense to stick a bit closer to the spec. Cheers, Ben. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PCIe resets/restore and lack of CRS wait 2018-03-22 13:46 ` Benjamin Herrenschmidt @ 2018-03-22 13:58 ` Sinan Kaya 2018-03-22 14:14 ` Bjorn Helgaas 0 siblings, 1 reply; 10+ messages in thread From: Sinan Kaya @ 2018-03-22 13:58 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: linux-pci, Bjorn Helgaas, Michael Neuling, linux-pci-owner On 3/22/2018 8:46 AM, Benjamin Herrenschmidt wrote: > On Thu, 2018-03-22 at 07:25 -0400, okaya@codeaurora.org wrote: >>>>> >>> >>> That tells me that there is no guarantee by spec that we'll get >>> ffff's, instead we might get HW stalls, or other really nasty >>> effects when probing a register other than 0 (VID/DID) for CRS. >> >> AFAIK, spec also mentions that sw needs to observe 0xffffffff for all >> other registers other than vendor id during CRS period. > > This isnt what's in the 3.1a spec at least ... section 2.3.2 explains > the specified behaviour which is, for any register other than 0 > (VID/DID), to re-issue the request... > I don't have any hard preference on this. Bjorn wanted code to work for systems with and without CRS capability. That was the reason we stayed away from 0xffff0001. CRS just gives you HW implementation defined retries for non vendor-id register like you mentioned. If device does not reply in this period of polling time, you should get all 1s eventually back. All 1s is the spec way of saying device doesn't exist for config transactions. Some HW can poll indefinitely or other HW can return immediately. If CRS visibility is supported & enabled, polling indefinitely and hogging the CPU wouldn't be the best HW implementation. > Now, it can have a timeout, and thus might be completed as a failed > transaction after a while, but it's a sub-optimal way (ie, we'll end up > hogging the CPU on loads) and it's not 100% clear that a failed > transaction returns as all 1's (it should but ...). > > I can make sure it happens the way the code expects on powerpc, I'm not > too worried about that, but I think for such a generic function, it > would make sense to stick a bit closer to the spec. > Cheers, > Ben. > > -- Sinan Kaya Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PCIe resets/restore and lack of CRS wait 2018-03-22 13:58 ` Sinan Kaya @ 2018-03-22 14:14 ` Bjorn Helgaas 2018-03-22 17:54 ` Sinan Kaya 0 siblings, 1 reply; 10+ messages in thread From: Bjorn Helgaas @ 2018-03-22 14:14 UTC (permalink / raw) To: Sinan Kaya Cc: Benjamin Herrenschmidt, linux-pci, Bjorn Helgaas, Michael Neuling, linux-pci-owner On Thu, Mar 22, 2018 at 08:58:06AM -0500, Sinan Kaya wrote: > On 3/22/2018 8:46 AM, Benjamin Herrenschmidt wrote: > > On Thu, 2018-03-22 at 07:25 -0400, okaya@codeaurora.org wrote: > >>> That tells me that there is no guarantee by spec that we'll get > >>> ffff's, instead we might get HW stalls, or other really nasty > >>> effects when probing a register other than 0 (VID/DID) for CRS. > >> > >> AFAIK, spec also mentions that sw needs to observe 0xffffffff for all > >> other registers other than vendor id during CRS period. > > > > This isnt what's in the 3.1a spec at least ... section 2.3.2 explains > > the specified behaviour which is, for any register other than 0 > > (VID/DID), to re-issue the request... > > I don't have any hard preference on this. Bjorn wanted code to work for > systems with and without CRS capability. That was the reason we stayed > away from 0xffff0001. CRS SV is optional, so the code has to work when it's absent. But we can tell whether it's supported, so if we need to, we can do something different when it's absent. > CRS just gives you HW implementation defined retries for non vendor-id > register like you mentioned. If device does not reply in this period > of polling time, you should get all 1s eventually back. > > All 1s is the spec way of saying device doesn't exist for config > transactions. I remember implementation notes mentioning all 1's data returns, e.g., PCIe r4.0, sec 2.3.2, but I don't *think* there's actually a spec requirement that CRS or other errors be reported that way. So if there's a way to avoid relying on all 1's data, I think that would be a good thing. Bjorn ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PCIe resets/restore and lack of CRS wait 2018-03-22 14:14 ` Bjorn Helgaas @ 2018-03-22 17:54 ` Sinan Kaya 2018-03-22 22:02 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 10+ messages in thread From: Sinan Kaya @ 2018-03-22 17:54 UTC (permalink / raw) To: Bjorn Helgaas Cc: Benjamin Herrenschmidt, linux-pci, Bjorn Helgaas, Michael Neuling, linux-pci-owner Ben, On 3/22/2018 9:14 AM, Bjorn Helgaas wrote: > CRS SV is optional, so the code has to work when it's absent. But we > can tell whether it's supported, so if we need to, we can do something > different when it's absent. Let us know if you want to work on this or not. Otherwise, I'll add to the list of things I'm doing. Sinan -- Sinan Kaya Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PCIe resets/restore and lack of CRS wait 2018-03-22 17:54 ` Sinan Kaya @ 2018-03-22 22:02 ` Benjamin Herrenschmidt 0 siblings, 0 replies; 10+ messages in thread From: Benjamin Herrenschmidt @ 2018-03-22 22:02 UTC (permalink / raw) To: Sinan Kaya, Bjorn Helgaas Cc: linux-pci, Bjorn Helgaas, Michael Neuling, linux-pci-owner On Thu, 2018-03-22 at 12:54 -0500, Sinan Kaya wrote: > Ben, > > On 3/22/2018 9:14 AM, Bjorn Helgaas wrote: > > CRS SV is optional, so the code has to work when it's absent. But we > > can tell whether it's supported, so if we need to, we can do something > > different when it's absent. > > Let us know if you want to work on this or not. Otherwise, I'll add > to the list of things I'm doing. On my side (powerpc) I'll make sure we do get ffff's and that our retry timeouts aren't too high. I just need to makes sure we don't end up triggering an EEH failure on the timeout though. Cheers, Ben. ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2018-03-22 22:02 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2018-03-22 4:03 PCIe resets/restore and lack of CRS wait Benjamin Herrenschmidt 2018-03-22 4:36 ` okaya 2018-03-22 5:12 ` Benjamin Herrenschmidt 2018-03-22 6:24 ` Benjamin Herrenschmidt 2018-03-22 11:25 ` okaya 2018-03-22 13:46 ` Benjamin Herrenschmidt 2018-03-22 13:58 ` Sinan Kaya 2018-03-22 14:14 ` Bjorn Helgaas 2018-03-22 17:54 ` Sinan Kaya 2018-03-22 22:02 ` Benjamin Herrenschmidt
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).