linux-nvme.lists.infradead.org archive mirror
* Deprecating NVME_IOCTL_SUBSYS_RESET
@ 2018-05-10 15:06 Alex G.
  2018-05-10 16:13 ` Keith Busch
  0 siblings, 1 reply; 4+ messages in thread
From: Alex G. @ 2018-05-10 15:06 UTC


Hi,

I've been getting reports that nvme subsystem resets end up taking down 
the entire machine. That's very easy to do with PCIe drives, since an
NSSR also brings down the PCIe link. Any in-flight posted requests can
generate unsupported request errors, and non-posted requests can 
generate completion timeouts, or Fatal MCEs on some PCIe root ports.

In a perfect world, PCIe errors would be handled by their respective 
layers, and we wouldn't need to care. Unfortunately, PCIe error handling
is still an ill-conceived afterthought. What concerns me is the
potential for an NSSR's effects to propagate outside of nvme. I suspect other fabrics
have much better error handling, but I wouldn't be surprised to see 
similar failures.

There are ways to harden the IOCTL by quiescing all IO before issuing 
the actual reset. Such safeguards are implemented everywhere else in the 
driver. Is NVME_IOCTL_SUBSYS_RESET used in the real world? I think it's
too big an attack surface, and we're better off returning -EOPNOTSUPP.
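
To make the user-visible effect concrete, here is a rough userspace sketch of
how a caller might cope if the ioctl started returning -EOPNOTSUPP. The helper
names try_subsys_reset() and describe_reset_result() are made up for
illustration, and the ioctl number is defined locally to mirror
<linux/nvme_ioctl.h> so the snippet stands alone. Returning -EOPNOTSUPP is the
proposal under discussion, not something the driver is known to do today.

```c
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>

/* Mirrors the definition in <linux/nvme_ioctl.h>; repeated here only so
 * the sketch is self-contained. */
#ifndef NVME_IOCTL_SUBSYS_RESET
#define NVME_IOCTL_SUBSYS_RESET _IO('N', 0x45)
#endif

/* Hypothetical helper: issue the subsystem reset ioctl on an already
 * opened nvme character device. Returns 0 or a negative errno value. */
static int try_subsys_reset(int fd)
{
	if (ioctl(fd, NVME_IOCTL_SUBSYS_RESET) < 0)
		return -errno;
	return 0;
}

/* Hypothetical helper: turn the result into a message for the operator.
 * The -EOPNOTSUPP branch assumes the deprecation proposed above. */
static const char *describe_reset_result(int ret)
{
	if (ret == 0)
		return "subsystem reset issued";
	if (ret == -EOPNOTSUPP)
		return "subsystem reset not supported by this kernel";
	return strerror(-ret);
}
```

A tool would call try_subsys_reset() on a descriptor for something like
/dev/nvme0 and fall back to a plain controller reset when it sees
-EOPNOTSUPP.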

I don't see any benefit in keeping it around. Thoughts?

Alex


end of thread, other threads:[~2018-05-10 16:52 UTC | newest]

Thread overview: 4+ messages
2018-05-10 15:06 Deprecating NVME_IOCTL_SUBSYS_RESET Alex G.
2018-05-10 16:13 ` Keith Busch
2018-05-10 16:18   ` Alex_Gagniuc
2018-05-10 16:52     ` Keith Busch
