From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id B2D16335BA9;
	Mon, 18 Aug 2025 13:50:11 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1755525011; cv=none;
	b=mP/SphUZcO4O5gked6QKfIVOQrXmA64Zo+kwaH0GEqhPPOYVa1Vh/SmsYTFkRE+DUxDMdMWfQ8/g5lAQ9Kd0esc0adyUZt4EZesPIXwtLvdcFFoTJWuu1UJqPQgU/AF/w1TMzDSUBU47D75Tj1BZfdK9KE0yHBpaP266/jxj+BA=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1755525011; c=relaxed/simple;
	bh=MJ2XRpJ4/+y3hy+EwBwFJXlMh1Sav2TNG13dkqdFTR4=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	 MIME-Version;
	b=HH9i9725BsTpJXaENYmnZnl3RVkWKVzKqiYUxIRySpzglQ5SkRSwZeRnJDS7CU3idNc4TgVb0zRWsSwZiJb+9APPIt1dybriq1Snnk6DgSJ/nHDiOOvDypSDQJfTHc/feYsqEKf3B/K3Y+5A7iqD9vTGJuPeGPp0eBucR/oHEb0=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org;
	dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b=WIK5IQct;
	arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b="WIK5IQct"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3E566C4CEEB;
	Mon, 18 Aug 2025 13:50:11 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org;
	s=korg; t=1755525011;
	bh=MJ2XRpJ4/+y3hy+EwBwFJXlMh1Sav2TNG13dkqdFTR4=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=WIK5IQctQzDSD1q0QAgpEexMZJFwAA0WJoBnuhBPgfCdNnlFe2u0v83Iw4zy+f23B
	 IofNEoiBzQW1432Dlc8I7z6oFuAx+NY2CoNOg8bUKNeu92Spn462O93g7Kd+GjQPpQ
	 DKRp5+2lZSsknkJXJXj1h2XVu6H2VzZYwjO6nlv4=
From: Greg Kroah-Hartman
To: stable@vger.kernel.org
Cc:
	Greg Kroah-Hartman, patches@lists.linux.dev, Keith Busch,
	Chaitanya Kulkarni, Nitesh Shetty, Christoph Hellwig, Sasha Levin
Subject: [PATCH 6.16 100/570] nvme-pci: try function level reset on init failure
Date: Mon, 18 Aug 2025 14:41:27 +0200
Message-ID: <20250818124509.665447597@linuxfoundation.org>
X-Mailer: git-send-email 2.50.1
In-Reply-To: <20250818124505.781598737@linuxfoundation.org>
References: <20250818124505.781598737@linuxfoundation.org>
User-Agent: quilt/0.68
X-stable: review
X-Patchwork-Hint: ignore
Precedence: bulk
X-Mailing-List: patches@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

6.16-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Keith Busch

[ Upstream commit 5b2c214a95942f7997d1916a4c44017becbc3cac ]

NVMe devices from multiple vendors appear to get stuck in a reset state
that we can't get out of with an NVMe level Controller Reset. The kernel
would report these with messages that look like:

  Device not ready; aborting reset, CSTS=0x1

These have historically required a power cycle to make them usable
again, but in many cases, a PCIe FLR is sufficient to restart operation
without a power cycle. Try it if the initial controller reset fails
during any nvme reset attempt.

Signed-off-by: Keith Busch
Reviewed-by: Chaitanya Kulkarni
Reviewed-by: Nitesh Shetty
Signed-off-by: Christoph Hellwig
Signed-off-by: Sasha Levin
---
 drivers/nvme/host/pci.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 320aaa41ec39..3ef30c36bf10 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1958,8 +1958,28 @@ static int nvme_pci_configure_admin_queue(struct nvme_dev *dev)
 	 * might be pointing at!
 	 */
 	result = nvme_disable_ctrl(&dev->ctrl, false);
-	if (result < 0)
-		return result;
+	if (result < 0) {
+		struct pci_dev *pdev = to_pci_dev(dev->dev);
+
+		/*
+		 * The NVMe Controller Reset method did not get an expected
+		 * CSTS.RDY transition, so something with the device appears to
+		 * be stuck. Use the lower level and bigger hammer PCIe
+		 * Function Level Reset to attempt restoring the device to its
+		 * initial state, and try again.
+		 */
+		result = pcie_reset_flr(pdev, false);
+		if (result < 0)
+			return result;
+
+		pci_restore_state(pdev);
+		result = nvme_disable_ctrl(&dev->ctrl, false);
+		if (result < 0)
+			return result;
+
+		dev_info(dev->ctrl.device,
+			"controller reset completed after pcie flr\n");
+	}
 
 	result = nvme_alloc_queue(dev, 0, NVME_AQ_DEPTH);
 	if (result)
-- 
2.39.5
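
For readers who want to see the control flow of the hunk in isolation,
here is a minimal userspace sketch of the "reset, fall back to FLR, reset
again" sequence. It is not kernel code and is not part of the patch:
struct fake_ctrl, fake_disable_ctrl(), fake_reset_flr() and
reset_with_flr_fallback() are hypothetical stand-ins invented for this
illustration, and the error codes are arbitrary choices. Only the ordering
of the steps is taken from the patch above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of a controller; not a kernel structure. */
struct fake_ctrl {
	bool stuck;		/* NVMe-level reset cannot complete while true */
	bool flr_supported;	/* whether a function level reset is available */
	int flr_count;		/* number of FLRs issued so far */
};

/* Stand-in for nvme_disable_ctrl(): fails while the device is stuck. */
static int fake_disable_ctrl(struct fake_ctrl *c)
{
	return c->stuck ? -ENODEV : 0;	/* error code chosen for illustration */
}

/* Stand-in for pcie_reset_flr(): in this model an FLR unsticks the device. */
static int fake_reset_flr(struct fake_ctrl *c)
{
	if (!c->flr_supported)
		return -ENOTTY;		/* error code chosen for illustration */
	c->flr_count++;
	c->stuck = false;
	return 0;
}

/* Same ordering as the patched hunk: reset, FLR on failure, reset again. */
static int reset_with_flr_fallback(struct fake_ctrl *c)
{
	int result = fake_disable_ctrl(c);

	if (result < 0) {
		result = fake_reset_flr(c);
		if (result < 0)
			return result;

		/* The real patch calls pci_restore_state() at this point. */
		result = fake_disable_ctrl(c);
		if (result < 0)
			return result;

		printf("controller reset completed after pcie flr\n");
	}
	return 0;
}
```

In this model a controller that starts out stuck comes back after exactly
one FLR, a healthy controller never triggers an FLR, and a stuck
controller without FLR support returns the FLR error to the caller, which
matches the early returns in the hunk.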