From mboxrd@z Thu Jan 1 00:00:00 1970
From: keith.busch@linux.intel.com (Keith Busch)
Date: Tue, 17 Jul 2018 08:54:35 -0600
Subject: [PATCHv4 1/4] nvme: Sync request queues on reset
In-Reply-To: <20180717134048.GA16134@lst.de>
References: <20180713205609.19701-1-keith.busch@intel.com>
 <20180713205609.19701-2-keith.busch@intel.com>
 <20180716153721.GB26265@localhost.localdomain>
 <20180716171232.GC26265@localhost.localdomain>
 <20180717134048.GA16134@lst.de>
Message-ID: <20180717145435.GD26925@localhost.localdomain>

On Tue, Jul 17, 2018 at 03:40:48PM +0200, Christoph Hellwig wrote:
> On Mon, Jul 16, 2018 at 11:12:33AM -0600, Keith Busch wrote:
> > On Mon, Jul 16, 2018 at 07:36:41PM +0300, Sagi Grimberg wrote:
> > > > The only reason we need this is because each namespace has its own
> > > > request queue with their own timeout work. We don't want all of these
> > > > scheduling multiple controller resets, so the sync here just ensures
> > > > that there is no active timeout work that's about to schedule another
> > > > reset while we're already resetting the controller.
> > >
> > > But scheduling a reset while a reset is running should not succeed. You
> > > should not be able to change state RESETTING -> RESETTING
> >
> > Timeout handlers call nvme_dev_disable prior to the reset schedule
> > attempt, which is the part that we want to prevent occurring concurrently
> > with an already scheduled reset.
>
> Is there any good way we can get rid of these out of state machine
> nvme_dev_disable calls? It might not be easy, but I think it is
> going to help us in the long run.

Possibly, I'll stare at this a bit more.

> I also think your original idea of a single work_struct per tag set
> for error handling might be a good idea. This is similar to the
> per-host eh thread SCSI had forever, so it might also help SCSI by
> getting rid of that and running directly from the block timeout
> context.

Okay, I'll re-examine that option.
It gets a little messy with the legacy request interface, but that's okay.
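
To illustrate the point about the state machine, here is a minimal user-space sketch. It is only a model, not driver code: the enum, struct, and function names are hypothetical, and the counter merely stands in for nvme_dev_disable. It shows that rejecting RESETTING -> RESETTING does not help when the disable step runs before the state change is attempted.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, simplified controller states; not the driver's real enum. */
enum ctrl_state { CTRL_LIVE, CTRL_RESETTING, CTRL_DEAD };

struct ctrl {
	_Atomic int state;
	int disable_calls;	/* counts how often the "disable" step ran */
};

/* Allow LIVE -> RESETTING only; RESETTING -> RESETTING is rejected. */
static bool ctrl_try_start_reset(struct ctrl *c)
{
	int expected = CTRL_LIVE;

	return atomic_compare_exchange_strong(&c->state, &expected,
					      CTRL_RESETTING);
}

/*
 * Model of a timeout handler that disables the device before it tries
 * to schedule a reset. The disable step runs even when the state
 * change is refused afterwards.
 */
static bool timeout_handler(struct ctrl *c)
{
	c->disable_calls++;	/* stands in for nvme_dev_disable */
	return ctrl_try_start_reset(c);
}
```

Calling timeout_handler twice on a controller that starts in CTRL_LIVE schedules only one reset, but the disable step still runs both times. That second, out-of-state-machine disable is the part the queue sync is meant to keep from overlapping with a reset that is already scheduled.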