From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 600D6335BDD;
	Mon, 18 Aug 2025 13:21:13 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1755523273; cv=none; b=XwDecyHOwAxWx0SAQFH/OEU+LnLiqkeHU/59eYhsIzU2lLYnhGbbWxuIo5aPa2fkEtpFz1XPI/4xqnSZYsURoIGH7BI6Rm3rl03FCT1BYS0fjUqu/kKuN5vGruB9tQfj8OZ+SEuzRJrTigIGU7jojUX0T2mgFYkEd132fdlwMSQ=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1755523273; c=relaxed/simple;
	bh=GogLS+T6ZYHLJhVwSgMuqKrX4/Q34AWMGI/W9SHTsnQ=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	 MIME-Version; b=t780/YOeiiEV/nKs2aq8H9UJLkgFmH/2z960CXIixd2f9roIgTtobOtpZA7um4gy7ttISbrGxu/HKsOeFBEPN8RRRlljI5ejgVzXzw2K+o9OrVzuyhB8RrGzqeXgf3UgIVMRoLqTrObduQwOEuSenK/Au80lyz+irdg23uyXZB0=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b=Nn6CgO4g; arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b="Nn6CgO4g"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id C5E0DC4CEEB;
	Mon, 18 Aug 2025 13:21:12 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org;
	s=korg; t=1755523273;
	bh=GogLS+T6ZYHLJhVwSgMuqKrX4/Q34AWMGI/W9SHTsnQ=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=Nn6CgO4g74UQEw/vs5jCARtHvlk6t2+wbmWaPfIyV71wN2bCO3pNMOktvVw6ghMnF
	 F4rY8L886MDgv/f16KdLHKFNhtkVN/+rSC/91fY0iGAqryPizj6hdf1CAwq6vNvp2X
	 iVVF9ou5TTCxoD+vXmweBXmkmOuoeNrpzYfYZRPQ=
From: Greg Kroah-Hartman
To: stable@vger.kernel.org
Cc:
	Greg Kroah-Hartman,
	patches@lists.linux.dev,
	Keith Busch,
	Chaitanya Kulkarni,
	Nitesh Shetty,
	Christoph Hellwig,
	Sasha Levin
Subject: [PATCH 6.15 088/515] nvme-pci: try function level reset on init failure
Date: Mon, 18 Aug 2025 14:41:14 +0200
Message-ID: <20250818124501.762548538@linuxfoundation.org>
X-Mailer: git-send-email 2.50.1
In-Reply-To: <20250818124458.334548733@linuxfoundation.org>
References: <20250818124458.334548733@linuxfoundation.org>
User-Agent: quilt/0.68
X-stable: review
X-Patchwork-Hint: ignore
Precedence: bulk
X-Mailing-List: patches@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

6.15-stable review patch. If anyone has any objections, please let me know.

------------------

From: Keith Busch

[ Upstream commit 5b2c214a95942f7997d1916a4c44017becbc3cac ]

NVMe devices from multiple vendors appear to get stuck in a reset state
that we can't get out of with an NVMe level Controller Reset. The kernel
would report these with messages that look like:

  Device not ready; aborting reset, CSTS=0x1

These have historically required a power cycle to make them usable
again, but in many cases, a PCIe FLR is sufficient to restart operation
without a power cycle. Try it if the initial controller reset fails
during any nvme reset attempt.

Signed-off-by: Keith Busch
Reviewed-by: Chaitanya Kulkarni
Reviewed-by: Nitesh Shetty
Signed-off-by: Christoph Hellwig
Signed-off-by: Sasha Levin
---
 drivers/nvme/host/pci.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 776c867fb64d..5396282015a2 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1888,8 +1888,28 @@ static int nvme_pci_configure_admin_queue(struct nvme_dev *dev)
 	 * might be pointing at!
 	 */
 	result = nvme_disable_ctrl(&dev->ctrl, false);
-	if (result < 0)
-		return result;
+	if (result < 0) {
+		struct pci_dev *pdev = to_pci_dev(dev->dev);
+
+		/*
+		 * The NVMe Controller Reset method did not get an expected
+		 * CSTS.RDY transition, so something with the device appears to
+		 * be stuck. Use the lower level and bigger hammer PCIe
+		 * Function Level Reset to attempt restoring the device to its
+		 * initial state, and try again.
+		 */
+		result = pcie_reset_flr(pdev, false);
+		if (result < 0)
+			return result;
+
+		pci_restore_state(pdev);
+		result = nvme_disable_ctrl(&dev->ctrl, false);
+		if (result < 0)
+			return result;
+
+		dev_info(dev->ctrl.device,
+			"controller reset completed after pcie flr\n");
+	}
 
 	result = nvme_alloc_queue(dev, 0, NVME_AQ_DEPTH);
 	if (result)
-- 
2.39.5