From mboxrd@z Thu Jan 1 00:00:00 1970
From: keith.busch@intel.com (Busch, Keith)
Date: Thu, 22 Oct 2015 15:16:26 +0000
Subject: [PATCH 6/9] nvme: abort requests on the reqeueue list when shutting down a controller
In-Reply-To: <20151022145842.GA32062@lst.de>
References: <1445515421-4940-1-git-send-email-hch@lst.de>
 <1445515421-4940-7-git-send-email-hch@lst.de>
 <20151022144419.GB21840@localhost.localdomain>
 <20151022145842.GA32062@lst.de>
Message-ID: <20151022151625.GC21840@localhost.localdomain>

On Thu, Oct 22, 2015 at 04:58:42PM +0200, Christoph Hellwig wrote:
> We're aborting all active commands through nvme_dev_shutdown ->
> nvme_clear_queue. Why would we skip commands that were active and are
> going to be active again ASAP?

This is a bit subtle. nvme_clear_queue at the end of shutdown allows
active commands that are being aborted to requeue, depending on the state
of the namespace's request_queue: if the queue isn't "dying",
nvme_clear_queue won't set the NVMe CQE "do-not-retry" (DNR) bit, so
req_completion may requeue them.

In the reset scenario, this change would fail requests that happen to be
on the requeue_list, while active commands being cancelled would still be
allowed to retry. That's not fair to requests that got on the list
early. :)

But it sounds like you may have found the real fix to the gap in a
different patch, so I'll skip this as suggested and continue with the
rest of the series.
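
To make the asymmetry above concrete, here is a rough standalone model of
the decision. It is only a sketch, not the driver code: the sketch_* types,
the outcome enum, the helper names and the status constants are all
hypothetical, and the assumption that the patch fails requeue_list entries
with DNR set is mine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative status bits, loosely modeled on the NVMe CQE status field.
 * The names and values are assumptions for this sketch. */
#define SC_ABORT_REQ 0x0007u
#define SC_DNR       0x4000u	/* "do not retry" */

enum outcome { OUTCOME_OK, OUTCOME_REQUEUED, OUTCOME_FAILED };

/* Hypothetical stand-ins for the namespace's request_queue and a request. */
struct sketch_queue { bool dying; };
struct sketch_request { struct sketch_queue *q; };

/* Status the cancel path gives an active command: DNR is added only
 * when the owning queue is dying. */
static unsigned int cancel_status(const struct sketch_request *req)
{
	unsigned int status = SC_ABORT_REQ;

	assert(req != NULL && req->q != NULL);
	if (req->q->dying)
		status |= SC_DNR;
	return status;
}

/* Completion side: an error without DNR is eligible for requeue. */
static enum outcome complete_request(unsigned int status)
{
	if (status == 0)
		return OUTCOME_OK;
	if (!(status & SC_DNR))
		return OUTCOME_REQUEUED;
	return OUTCOME_FAILED;
}

/* Assumed behavior of the patch under discussion: a request already on
 * the requeue list is failed regardless of the queue state. */
static enum outcome abort_requeued_request(void)
{
	return complete_request(SC_ABORT_REQ | SC_DNR);
}
```

With a queue that is not dying (the reset case), an active command goes
through cancel_status() and complete_request() and ends up requeued, while
a request that was already waiting on the requeue list is failed outright.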