From mboxrd@z Thu Jan  1 00:00:00 1970
From: keith.busch@intel.com (Keith Busch)
Date: Tue, 10 Nov 2015 20:28:11 +0000
Subject: [PATCH 09/12] nvme: properly free resources for cancelled command
In-Reply-To: <20151110160344.GB31697@localhost.localdomain>
References: <1446885906-20967-1-git-send-email-hch@lst.de>
 <1446885906-20967-10-git-send-email-hch@lst.de>
 <20151109185731.GB5386@localhost.localdomain>
 <20151109192518.GA10681@lst.de>
 <20151109201232.GC5386@localhost.localdomain>
 <20151110081357.GA21708@lst.de>
 <20151110160344.GB31697@localhost.localdomain>
Message-ID: <20151110202811.GA6506@localhost.localdomain>

On Tue, Nov 10, 2015 at 04:03:45PM +0000, Keith Busch wrote:
> On Tue, Nov 10, 2015 at 09:13:57AM +0100, Christoph Hellwig wrote:
> > Set Features for set_queue_count times out we'll call the reset handler,
> > which because we are inside the probe handler will remove the device.
> > How do we care about the return value in that case?
> > 
> > Can you write down a few sentences on why/how we care?  I'll volunteer
> > to put them into the driver in comment form once we have all this sorted
> > out so that anyone touching the driver in the future won't be as confused.
> 
> Perhaps I am thinking how probing serially worked before, and don't
> understand how this works anymore. :)
> 
> You're right, we don't really care anymore if the reset handler unwinds
> it. This path is then safe to see a fake error code.

Actually, this still needs to be a negative error so that
nvme_reset_work doesn't clear the NVME_CTRL_RESETTING bit. Without that,
the driver gets stuck in an infinite reset loop.
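
To illustrate the point, here is a rough userspace toy model, not the
driver code. Everything named toy_* is hypothetical, and the model
assumes that a timeout only kicks off a reset when the resetting bit is
clear, and that the reset work leaves the bit set only when it sees a
negative error:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the controller; not the driver's struct. */
struct toy_ctrl {
	bool resetting;	/* models the NVME_CTRL_RESETTING bit */
	int resets_run;	/* number of times the toy reset work ran */
};

/* Assumed behaviour: a timeout starts a reset only if none is pending. */
static bool toy_timeout(struct toy_ctrl *c)
{
	if (c->resetting)
		return false;
	c->resetting = true;
	return true;
}

/*
 * Toy reset work. "status" is what the cancelled command reports:
 * negative means an errno, positive means a (fake) NVMe status code.
 */
static void toy_reset_work(struct toy_ctrl *c, int status)
{
	c->resets_run++;
	if (status < 0)
		return;		/* failure path: bit stays set */
	c->resetting = false;	/* treated as success: bit is cleared */
}

/* Let the same command time out up to max_rounds times; count resets. */
static int toy_simulate(int status, int max_rounds)
{
	struct toy_ctrl c = { false, 0 };

	for (int i = 0; i < max_rounds; i++) {
		if (!toy_timeout(&c))
			break;
		toy_reset_work(&c, status);
	}
	return c.resets_run;
}
```

Under those assumptions, toy_simulate(-EIO, 5) runs a single reset and
stops, while toy_simulate(0x7, 5) runs a reset in every round, which is
the bounded stand-in for the loop that never ends.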

 
> But the reset handler is the same "work" as probe now, so it won't get
> scheduled. Now I completely understand why we changed nvme_timeout()
> to end the request with -EIO instead of waiting for the reset work to
> cancel it. That's still unsafe since it frees the command for reuse
> while the ID is still technically owned by the controller.