* Getting rid of nvme's pci watchdog timer?
@ 2017-05-23 21:23 Andy Lutomirski
  2017-05-24  9:42 ` Christoph Hellwig
  0 siblings, 1 reply; 3+ messages in thread
From: Andy Lutomirski @ 2017-05-23 21:23 UTC


nvme polls pci devices to make sure they're not dead even when they're
completely idle.  Can we fix that without killing scalability?  Can we
piggyback off the block layer's timeout mechanism?  Could we simply
*delete* the watchdog timer and add some CSTS and PCI_STATUS
diagnostics to nvme_timeout()?

I doubt that one rounded wakeup per second makes much difference on
CPU/package power saving, but it could be pretty bad for ASPM and
maybe even APST power saving.


* Getting rid of nvme's pci watchdog timer?
  2017-05-23 21:23 Getting rid of nvme's pci watchdog timer? Andy Lutomirski
@ 2017-05-24  9:42 ` Christoph Hellwig
  2017-05-24 11:21   ` Keith Busch
  0 siblings, 1 reply; 3+ messages in thread
From: Christoph Hellwig @ 2017-05-24  9:42 UTC


On Tue, May 23, 2017 at 02:23:50PM -0700, Andy Lutomirski wrote:
> nvme polls pci devices to make sure they're not dead even when they're
> completely idle.  Can we fix that without killing scalability?  Can we
> piggyback off the block layer's timeout mechanism?  Could we simply
> *delete* the watchdog timer and add some CSTS and PCI_STATUS
> diagnostics to nvme_timeout()?
> 
> I doubt that one rounded wakeup per second makes much difference on
> CPU/package power saving, but it could be pretty bad for ASPM and
> maybe even APST power saving.

Note that section 8.4.1 in NVMe says the controller should service MMIO
accesses in non-operational power states.  That being said, not touching
an idle controller all the time seems pretty useful to me.  I don't
really understand the need for the watchdog fully as it predates my
involvement, but I vaguely remember discussing the issue with Keith
a while ago, so he might be able to help.


* Getting rid of nvme's pci watchdog timer?
  2017-05-24  9:42 ` Christoph Hellwig
@ 2017-05-24 11:21   ` Keith Busch
  0 siblings, 0 replies; 3+ messages in thread
From: Keith Busch @ 2017-05-24 11:21 UTC


On Wed, May 24, 2017 at 11:42:19AM +0200, Christoph Hellwig wrote:
> an idle controller all the time seems pretty useful to me.  I don't
> really understand the need for the watchdog fully as it predates my
> involvement, but I vaguely remember discussing the issue with Keith
> a while ago, so he might be able to help.

This was originally added to the driver's health check kthread to
keep a pulse on the devices, but I think it's causing more harm than
good. The hotplug races are bad enough, and this power savings killer is
yet another reason to remove it. I'm in favor of removing the watchdog
timer and moving the status check to the timeout callback so we verify
health only when we have a reason to believe something is wrong.
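
[Illustrative sketch, not part of the original message and not the
actual driver change.] The idea of checking status only from the timeout
path could look roughly like the userspace C below. It assumes the caller
has already read CSTS from the controller's MMIO space and PCI_STATUS
from config space. The function name, the enum, and the policy of
"all-ones means gone, CFS means reset, otherwise abort the command" are
hypothetical; only the CSTS.RDY (bit 0) and CSTS.CFS (bit 1) positions
come from the NVMe specification.

```c
#include <stdint.h>
#include <stdio.h>

/* CSTS bit positions as defined by the NVMe specification. */
#define CSTS_RDY (1u << 0) /* controller ready */
#define CSTS_CFS (1u << 1) /* controller fatal status */

/* Hypothetical outcomes for a timeout handler; names are illustrative. */
enum timeout_action {
	TIMEOUT_ABORT_CMD,   /* controller looks healthy: abort only this command */
	TIMEOUT_RESET_CTRL,  /* controller reports a fatal error: reset it */
	TIMEOUT_DEVICE_GONE, /* register reads return all ones: device is unreachable */
};

/*
 * Decide what to do when a command times out, using register values the
 * caller has already read. Nothing here runs while the device is idle,
 * so no periodic wakeup is needed.
 */
enum timeout_action diagnose_on_timeout(uint32_t csts, uint16_t pci_status)
{
	/* A removed or dead PCI device typically reads back as all ones. */
	if (csts == UINT32_MAX) {
		fprintf(stderr, "timeout: device unreachable, CSTS=0x%08x PCI_STATUS=0x%04x\n",
			(unsigned)csts, (unsigned)pci_status);
		return TIMEOUT_DEVICE_GONE;
	}

	/* The controller itself is reporting a fatal condition. */
	if (csts & CSTS_CFS) {
		fprintf(stderr, "timeout: controller fatal status, CSTS=0x%08x PCI_STATUS=0x%04x\n",
			(unsigned)csts, (unsigned)pci_status);
		return TIMEOUT_RESET_CTRL;
	}

	/* No evidence of a dead controller: handle the single stuck command. */
	return TIMEOUT_ABORT_CMD;
}
```

The point of the sketch is only that the CSTS and PCI_STATUS reads happen
when a command has already timed out, so an idle device is never touched.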

