From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753956AbeASGCJ (ORCPT ); Fri, 19 Jan 2018 01:02:09 -0500 Received: from mga01.intel.com ([192.55.52.88]:2713 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751282AbeASGCD (ORCPT ); Fri, 19 Jan 2018 01:02:03 -0500 X-Amp-Result: UNKNOWN X-Amp-Original-Verdict: FILE UNKNOWN X-Amp-File-Uploaded: False X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.46,380,1511856000"; d="scan'208";a="24611576" Date: Thu, 18 Jan 2018 23:05:21 -0700 From: Keith Busch To: "jianchao.wang" Cc: axboe@fb.com, hch@lst.de, sagi@grimberg.me, maxg@mellanox.com, james.smart@broadcom.com, linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH V5 2/2] nvme-pci: fixup the timeout case when reset is ongoing Message-ID: <20180119060521.GD12043@localhost.localdomain> References: <1516270202-8051-1-git-send-email-jianchao.w.wang@oracle.com> <1516270202-8051-3-git-send-email-jianchao.w.wang@oracle.com> <20180119045944.GC12043@localhost.localdomain> <0b74b36d-ecb5-e9e2-2900-6dc9c9699658@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <0b74b36d-ecb5-e9e2-2900-6dc9c9699658@oracle.com> User-Agent: Mutt/1.9.1 (2017-09-22) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Jan 19, 2018 at 01:55:29PM +0800, jianchao.wang wrote: > On 01/19/2018 12:59 PM, Keith Busch wrote: > > On Thu, Jan 18, 2018 at 06:10:02PM +0800, Jianchao Wang wrote: > >> + * - When the ctrl.state is NVME_CTRL_RESETTING, the expired > >> + * request should come from the previous work and we handle > >> + * it as nvme_cancel_request. > >> + * - When the ctrl.state is NVME_CTRL_RECONNECTING, the expired > >> + * request should come from the initializing procedure such as > >> + * setup io queues, because all the previous outstanding > >> + * requests should have been cancelled. > >> */ > >> - if (dev->ctrl.state == NVME_CTRL_RESETTING) { > >> - dev_warn(dev->ctrl.device, > >> - "I/O %d QID %d timeout, disable controller\n", > >> - req->tag, nvmeq->qid); > >> - nvme_dev_disable(dev, false); > >> + switch (dev->ctrl.state) { > >> + case NVME_CTRL_RESETTING: > >> + nvme_req(req)->status = NVME_SC_ABORT_REQ; > >> + return BLK_EH_HANDLED; > >> + case NVME_CTRL_RECONNECTING: > >> + WARN_ON_ONCE(nvmeq->qid); > >> nvme_req(req)->flags |= NVME_REQ_CANCELLED; > >> return BLK_EH_HANDLED; > >> + default: > >> + break; > >> } > > > > The driver may be giving up on the command here, but that doesn't mean > > the controller has. We can't just end the request like this because that > > will release the memory the controller still owns. We must wait until > > after nvme_dev_disable clears bus master because we can't say for sure > > the controller isn't going to write to that address right after we end > > the request. > > > Yes, but the controller is going to be reseted or shutdown at the moment, > even if the controller accesses a bad address and goes wrong, everything will > be ok after reset or shutdown. :) Hm, I don't follow. DMA access after free is never okay.