From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 331099443
	for ; Sun, 13 Aug 2023 21:24:47 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id AE872C433C8;
	Sun, 13 Aug 2023 21:24:46 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org;
	s=korg; t=1691961887;
	bh=SxsEya9kUcl2yirQXYpk4xrz84SrBFJe0cqmKjwZj6M=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=p4nMkSzjYp06k8GIdCCJnPx8o4EPS3n0RlICDloyz49wXdJ4guyrsFhrg2Ruwux7/
	 5FRBWTTNTg5aetEwk2vwmlYNt6IoHNIBMzciESkFmLJAUYTexE23bABY4216YttHns
	 rbGkWY6uX03N9oIu+liNOAwGB3rOTSWvFFU/xySY=
From: Greg Kroah-Hartman
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman,
	patches@lists.linux.dev,
	Keith Busch,
	Sagi Grimberg,
	Chunguang Xu,
	Ming Lei
Subject: [PATCH 6.4 035/206] nvme: fix possible hang when removing a controller during error recovery
Date: Sun, 13 Aug 2023 23:16:45 +0200
Message-ID: <20230813211726.006453005@linuxfoundation.org>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230813211724.969019629@linuxfoundation.org>
References: <20230813211724.969019629@linuxfoundation.org>
User-Agent: quilt/0.67
Precedence: bulk
X-Mailing-List: patches@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

From: Ming Lei

commit 1b95e817916069ec45a7f259d088fd1c091a8cc6 upstream.

Error recovery can be interrupted by controller removal, then the
controller is left as quiesced, and IO hang can be caused.

Fix the issue by unquiescing controller unconditionally when removing
namespaces.

This way is reasonable and safe given forward progress can be made
when removing namespaces.

Reviewed-by: Keith Busch
Reviewed-by: Sagi Grimberg
Reported-by: Chunguang Xu
Closes: https://lore.kernel.org/linux-nvme/cover.1685350577.git.chunguang.xu@shopee.com/
Cc: stable@vger.kernel.org
Signed-off-by: Ming Lei
Signed-off-by: Keith Busch
Signed-off-by: Greg Kroah-Hartman
---
 drivers/nvme/host/core.c |   10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4728,6 +4728,12 @@ void nvme_remove_namespaces(struct nvme_
 	 */
 	nvme_mpath_clear_ctrl_paths(ctrl);
 
+	/*
+	 * Unquiesce io queues so any pending IO won't hang, especially
+	 * those submitted from scan work
+	 */
+	nvme_unquiesce_io_queues(ctrl);
+
 	/* prevent racing with ns scanning */
 	flush_work(&ctrl->scan_work);
 
@@ -4737,10 +4743,8 @@ void nvme_remove_namespaces(struct nvme_
 	 * removing the namespaces' disks; fail all the queues now to avoid
 	 * potentially having to clean up the failed sync later.
 	 */
-	if (ctrl->state == NVME_CTRL_DEAD) {
+	if (ctrl->state == NVME_CTRL_DEAD)
 		nvme_mark_namespaces_dead(ctrl);
-		nvme_unquiesce_io_queues(ctrl);
-	}
 
 	/* this is a no-op when called from the controller reset handler */
 	nvme_change_ctrl_state(ctrl, NVME_CTRL_DELETING_NOIO);