* hfi1 use of PCI internals @ 2016-06-16 16:20 Bjorn Helgaas 2016-06-16 18:48 ` Ashutosh Dixit 0 siblings, 1 reply; 6+ messages in thread From: Bjorn Helgaas @ 2016-06-16 16:20 UTC (permalink / raw) To: Mike Marciniszyn, Dennis Dalessandro Cc: Doug Ledford, Sean Hefty, Hal Rosenstock, linux-rdma, linux-kernel, linux-pci I noticed drivers/infiniband/hw/hfi1 got moved from staging to drivers/ for v4.7. It does a bunch of grubbing around in PCIe ASPM configuration, e.g., see drivers/infiniband/hw/hfi1/aspm.h. I know there have been lots of ASPM issues, both hardware problems and Linux kernel problems, but it is *supposed* to be manageable by the core, without special driver support. What's the justification for having to do this in the hfi1 driver? Bjorn ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: hfi1 use of PCI internals 2016-06-16 16:20 hfi1 use of PCI internals Bjorn Helgaas @ 2016-06-16 18:48 ` Ashutosh Dixit 2016-06-16 20:08 ` Bjorn Helgaas 0 siblings, 1 reply; 6+ messages in thread From: Ashutosh Dixit @ 2016-06-16 18:48 UTC (permalink / raw) To: Bjorn Helgaas Cc: Marciniszyn, Mike, Dalessandro, Dennis, Doug Ledford, Hefty, Sean, Hal Rosenstock, linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org On Thu, Jun 16 2016 at 12:20:52 PM, Bjorn Helgaas <helgaas@kernel.org> wrote: > I noticed drivers/infiniband/hw/hfi1 got moved from staging to > drivers/ for v4.7. It does a bunch of grubbing around in PCIe ASPM > configuration, e.g., see drivers/infiniband/hw/hfi1/aspm.h. > > I know there have been lots of ASPM issues, both hardware problems and > Linux kernel problems, but it is *supposed* to be manageable by the > core, without special driver support. What's the justification for > having to do this in the hfi1 driver? The description for commit affa48de84 "staging/rdma/hfi1: Add support for enabling/disabling PCIe ASPM" anticipates this question and describes why this was done in the hfi1 driver: Finally, the kernel ASPM API is not used in this patch. This is because this patch does several non-standard things as SW workarounds for HW issues. As mentioned above, it enables ASPM even when advertised actual latencies are greater than acceptable latencies. Also, whereas the kernel API only allows drivers to disable ASPM from driver probe, this patch enables/disables ASPM directly from interrupt context. Due to these reasons the kernel ASPM API was not used. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: hfi1 use of PCI internals 2016-06-16 18:48 ` Ashutosh Dixit @ 2016-06-16 20:08 ` Bjorn Helgaas 2016-06-17 13:58 ` Dennis Dalessandro 2016-06-17 22:05 ` Ashutosh Dixit 0 siblings, 2 replies; 6+ messages in thread From: Bjorn Helgaas @ 2016-06-16 20:08 UTC (permalink / raw) To: Ashutosh Dixit Cc: Marciniszyn, Mike, Dalessandro, Dennis, Doug Ledford, Hefty, Sean, Hal Rosenstock, linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org On Thu, Jun 16, 2016 at 02:48:30PM -0400, Ashutosh Dixit wrote: > On Thu, Jun 16 2016 at 12:20:52 PM, Bjorn Helgaas <helgaas@kernel.org> wrote: > > I noticed drivers/infiniband/hw/hfi1 got moved from staging to > > drivers/ for v4.7. It does a bunch of grubbing around in PCIe ASPM > > configuration, e.g., see drivers/infiniband/hw/hfi1/aspm.h. > > > > I know there have been lots of ASPM issues, both hardware problems and > > Linux kernel problems, but it is *supposed* to be manageable by the > > core, without special driver support. What's the justification for > > having to do this in the hfi1 driver? > > The description for commit affa48de84 "staging/rdma/hfi1: Add support > for enabling/disabling PCIe ASPM" anticipates this question and > describes why this was done in the hfi1 driver: > > Finally, the kernel ASPM API is not used in this patch. This is > because this patch does several non-standard things as SW > workarounds for HW issues. As mentioned above, it enables ASPM even > when advertised actual latencies are greater than acceptable > latencies. Also, whereas the kernel API only allows drivers to > disable ASPM from driver probe, this patch enables/disables ASPM > directly from interrupt context. Due to these reasons the kernel > ASPM API was not used. That's a good start, but leads to more questions. For example, it doesn't answer the obvious question of why the driver needs to enable/disable ASPM from interrupt context. Disabling ASPM should only require writing the device's Link Control register. The PCI core could probably provide an interface to do that in interrupt context. Enabling ASPM is not latency-critical and could probably be done from a work queue outside interrupt context, although conceptually there shouldn't be much required here either, and possibly the PCI core interface could be improved. It's possible the latency problem could be handled by some sort of quirk that overrides the acceptable latency. It's hard enough to get ASPM support in the PCI core correct without having to worry about drivers doing their own thing behind the back of the core. As far as I can tell, none of these PCI questions were raised on linux-pci, so we never even had a chance to have a conversation about them. Bjorn ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: hfi1 use of PCI internals 2016-06-16 20:08 ` Bjorn Helgaas @ 2016-06-17 13:58 ` Dennis Dalessandro 2016-06-17 22:05 ` Ashutosh Dixit 1 sibling, 0 replies; 6+ messages in thread From: Dennis Dalessandro @ 2016-06-17 13:58 UTC (permalink / raw) To: Bjorn Helgaas Cc: Ashutosh Dixit, Marciniszyn, Mike, Doug Ledford, Hefty, Sean, Hal Rosenstock, linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org On Thu, Jun 16, 2016 at 03:08:17PM -0500, Bjorn Helgaas wrote: >As far as I can tell, none of these PCI questions were raised on >linux-pci, so we never even had a chance to have a conversation about >them. I'll let Ashutosh handle the technical details here since he is most familiar with the code in question. I just want to mention that the move out of staging doesn't imply the driver is done being developed. We are very much open to discussing whatever the PCI folks see as needing to be addressed. -Denny ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: hfi1 use of PCI internals 2016-06-16 20:08 ` Bjorn Helgaas 2016-06-17 13:58 ` Dennis Dalessandro @ 2016-06-17 22:05 ` Ashutosh Dixit 2016-06-17 23:04 ` Bjorn Helgaas 1 sibling, 1 reply; 6+ messages in thread From: Ashutosh Dixit @ 2016-06-17 22:05 UTC (permalink / raw) To: Bjorn Helgaas Cc: Marciniszyn, Mike, Dalessandro, Dennis, Doug Ledford, Hefty, Sean, Hal Rosenstock, linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org On Thu, Jun 16 2016 at 04:08:17 PM, Bjorn Helgaas <helgaas@kernel.org> wrote: > > That's a good start, but leads to more questions. For example, it > doesn't answer the obvious question of why the driver needs to > enable/disable ASPM from interrupt context. For power saving reasons we keep ASPM L1 enabled, but implement a heuristic to "quickly" disable ASPM L1 when we notice PCIe traffic (as measured by the interrupt rate) starting up. If interrupt activity ceases ASPM L1 is re-enabled. > Disabling ASPM should only require writing the device's Link Control > register. The PCI core could probably provide an interface to do that > in interrupt context. > > Enabling ASPM is not latency-critical and could probably be done from > a work queue outside interrupt context, although conceptually there > shouldn't be much required here either, and possibly the PCI core > interface could be improved. That is true, to keep latencies low we need to disable ASPM from interrupt context, but re-enabling ASPM is not latency critical. > It's possible the latency problem could be handled by some sort of > quirk that overrides the acceptable latency. Correct, this is another issue that needs to be resolved. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: hfi1 use of PCI internals 2016-06-17 22:05 ` Ashutosh Dixit @ 2016-06-17 23:04 ` Bjorn Helgaas 0 siblings, 0 replies; 6+ messages in thread From: Bjorn Helgaas @ 2016-06-17 23:04 UTC (permalink / raw) To: Ashutosh Dixit Cc: Marciniszyn, Mike, Dalessandro, Dennis, Doug Ledford, Hefty, Sean, Hal Rosenstock, linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org On Fri, Jun 17, 2016 at 06:05:43PM -0400, Ashutosh Dixit wrote: > On Thu, Jun 16 2016 at 04:08:17 PM, Bjorn Helgaas <helgaas@kernel.org> wrote: > > > > That's a good start, but leads to more questions. For example, it > > doesn't answer the obvious question of why the driver needs to > > enable/disable ASPM from interrupt context. > > For power saving reasons we keep ASPM L1 enabled, but implement a > heuristic to "quickly" disable ASPM L1 when we notice PCIe traffic (as > measured by the interrupt rate) starting up. If interrupt activity > ceases ASPM L1 is re-enabled. > > > Disabling ASPM should only require writing the device's Link Control > > register. The PCI core could probably provide an interface to do that > > in interrupt context. > > > > Enabling ASPM is not latency-critical and could probably be done from > > a work queue outside interrupt context, although conceptually there > > shouldn't be much required here either, and possibly the PCI core > > interface could be improved. > > That is true, to keep latencies low we need to disable ASPM from > interrupt context, but re-enabling ASPM is not latency critical. For endpoint devices, it should be theoretically possible to enable/disable ASPM very quickly, by touching only that device. We don't do that today because pcie/aspm.c does all sorts of buffoonery and path walking. I think that could be simplified, assuming we think this sort of intensive ASPM-management is desirable. > > It's possible the latency problem could be handled by some sort of > > quirk that overrides the acceptable latency. > > Correct, this is another issue that needs to be resolved. ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2016-06-17 23:04 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-06-16 16:20 hfi1 use of PCI internals Bjorn Helgaas 2016-06-16 18:48 ` Ashutosh Dixit 2016-06-16 20:08 ` Bjorn Helgaas 2016-06-17 13:58 ` Dennis Dalessandro 2016-06-17 22:05 ` Ashutosh Dixit 2016-06-17 23:04 ` Bjorn Helgaas
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).