From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Sagi Grimberg <sagi@grimberg.me>, Christoph Hellwig <hch@lst.de>,
Sasha Levin <sashal@kernel.org>,
linux-nvme@lists.infradead.org
Subject: [PATCH AUTOSEL 5.8 35/53] nvme-rdma: fix reset hang if controller died in the middle of a reset
Date: Mon, 7 Sep 2020 12:32:01 -0400
Message-ID: <20200907163220.1280412-35-sashal@kernel.org>
In-Reply-To: <20200907163220.1280412-1-sashal@kernel.org>

From: Sagi Grimberg <sagi@grimberg.me>

[ Upstream commit 2362acb6785611eda795bfc12e1ea6b202ecf62c ]

If the controller becomes unresponsive in the middle of a reset, we
will hang because we are waiting for the freeze to complete, but that
cannot happen since we have commands that are inflight holding the
q_usage_counter, and we can't blindly fail requests that time out.

So give a timeout and if we cannot wait for queue freeze before
unfreezing, fail and have the error handling take care of how to
proceed (either schedule a reconnect or remove the controller).

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/nvme/host/rdma.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 99bf88eb812c5..6c07bb55b0f83 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -950,7 +950,15 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 	if (!new) {
 		nvme_start_queues(&ctrl->ctrl);
-		nvme_wait_freeze(&ctrl->ctrl);
+		if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
+			/*
+			 * If we timed out waiting for freeze we are likely to
+			 * be stuck. Fail the controller initialization just
+			 * to be safe.
+			 */
+			ret = -ENODEV;
+			goto out_wait_freeze_timed_out;
+		}
 		blk_mq_update_nr_hw_queues(ctrl->ctrl.tagset,
 			ctrl->ctrl.queue_count - 1);
 		nvme_unfreeze(&ctrl->ctrl);
@@ -958,6 +966,9 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 	return 0;
+out_wait_freeze_timed_out:
+	nvme_stop_queues(&ctrl->ctrl);
+	nvme_rdma_stop_io_queues(ctrl);
 out_cleanup_connect_q:
 	if (new)
 		blk_cleanup_queue(ctrl->ctrl.connect_q);
--
2.25.1
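
For readers who want to see the control flow in isolation, here is a
minimal userspace sketch of the pattern the patch applies: wait for
inflight commands to drain with a bounded timeout, and unwind through
a goto label if the wait times out. It is not kernel code. The
model_ctrl structure, its fields, and the model_* functions are
hypothetical stand-ins; "inflight" only approximates the role of
q_usage_counter, and time is counted in abstract ticks rather than
jiffies.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a controller (not a kernel structure). */
struct model_ctrl {
	int inflight;		/* commands holding the usage counter */
	int drain_per_tick;	/* completions per tick; 0 models a dead controller */
	bool queues_started;
};

/*
 * Wait until no commands are inflight or the timeout expires.
 * Returns a non-zero remaining time on success and 0 on timeout,
 * which is the convention the patched caller relies on.
 */
static long model_wait_freeze_timeout(struct model_ctrl *c, long timeout)
{
	assert(timeout >= 0);

	while (c->inflight > 0 && timeout > 0) {
		c->inflight -= c->drain_per_tick;
		if (c->inflight < 0)
			c->inflight = 0;
		timeout--;
	}
	if (c->inflight > 0)
		return 0;			/* timed out */
	return timeout > 0 ? timeout : 1;	/* drained in time */
}

/* Reset-path sketch: fail with -ENODEV instead of waiting forever. */
static int model_configure_io_queues(struct model_ctrl *c, long timeout)
{
	int ret;

	c->queues_started = true;		/* stands in for starting the queues */
	if (!model_wait_freeze_timeout(c, timeout)) {
		ret = -ENODEV;
		goto out_wait_freeze_timed_out;
	}
	/* Here the real code updates the hw queue count and unfreezes. */
	return 0;

out_wait_freeze_timed_out:
	c->queues_started = false;		/* stands in for stopping the queues */
	return ret;
}
```

With a responsive model (drain_per_tick > 0) the function returns 0 and
leaves the queues started; with drain_per_tick == 0 it returns -ENODEV
after the timeout and leaves the queues stopped, so a caller can decide
whether to reconnect or remove the controller.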
Thread overview: 55+ messages
2020-09-07 16:31 [PATCH AUTOSEL 5.8 01/53] ARC: HSDK: wireup perf irq Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 02/53] dmaengine: acpi: Put the CSRT table after using it Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 03/53] MIPS: Loongson64: Do not override watch and ejtag feature Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 04/53] netfilter: conntrack: allow sctp hearbeat after connection re-use Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 05/53] rxrpc: Keep the ACK serial in a var in rxrpc_input_ack() Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 06/53] netfilter: nft_set_rbtree: Detect partial overlap with start endpoint match Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 07/53] drivers/net/wan/lapbether: Added needed_tailroom Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 08/53] NFC: st95hf: Fix memleak in st95hf_in_send_cmd Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 09/53] firestream: Fix memleak in fs_open Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 10/53] scsi: qedf: Fix null ptr reference in qedf_stag_change_work Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 11/53] ALSA: hda: Fix 2 channel swapping for Tegra Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 12/53] ALSA: hda/tegra: Program WAKEEN register " Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 13/53] drivers/dma/dma-jz4780: Fix race condition between probe and irq handler Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 14/53] ibmvnic fix NULL tx_pools and rx_tools issue at do_reset Sasha Levin
2020-09-07 21:10 ` Jakub Kicinski
2020-09-07 22:24 ` Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 15/53] net: hns3: Fix for geneve tx checksum bug Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 16/53] xfs: fix off-by-one in inode alloc block reservation calculation Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 17/53] drivers/net/wan/lapbether: Set network_header before transmitting Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 18/53] wireless: fix wrong 160/80+80 MHz setting Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 19/53] cfg80211: regulatory: reject invalid hints Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 20/53] mac80211: reduce packet loss event false positives Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 21/53] cfg80211: Adjust 6 GHz frequency to channel conversion Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 22/53] net: usb: Fix uninit-was-stored issue in asix_read_phy_addr() Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 23/53] xfs: initialize the shortform attr header padding entry Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 24/53] ARC: show_regs: fix r12 printing and simplify Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 25/53] irqchip/eznps: Fix build error for !ARC700 builds Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 26/53] media: gpio-ir-tx: spinlock is not needed to disable interrupts Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 27/53] nvmet-tcp: Fix NULL dereference when a connect data comes in h2cdata pdu Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 28/53] nvme-fabrics: don't check state NVME_CTRL_NEW for request acceptance Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 29/53] nvme: have nvme_wait_freeze_timeout return if it timed out Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 30/53] nvme-tcp: serialize controller teardown sequences Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 31/53] nvme-tcp: fix timeout handler Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 32/53] nvme-tcp: fix reset hang if controller died in the middle of a reset Sasha Levin
2020-09-07 16:31 ` [PATCH AUTOSEL 5.8 33/53] nvme-rdma: serialize controller teardown sequences Sasha Levin
2020-09-07 16:32 ` [PATCH AUTOSEL 5.8 34/53] nvme-rdma: fix timeout handler Sasha Levin
2020-09-07 16:32 ` Sasha Levin [this message]
2020-09-07 16:32 ` [PATCH AUTOSEL 5.8 36/53] nvme-pci: cancel nvme device request before disabling Sasha Levin
2020-09-07 16:32 ` [PATCH AUTOSEL 5.8 37/53] HID: quirks: Set INCREMENT_USAGE_ON_DUPLICATE for all Saitek X52 devices Sasha Levin
2020-09-07 16:32 ` [PATCH AUTOSEL 5.8 38/53] HID: microsoft: Add rumble support for the 8bitdo SN30 Pro+ controller Sasha Levin
2020-09-07 16:32 ` [PATCH AUTOSEL 5.8 39/53] drivers/net/wan/hdlc_cisco: Add hard_header_len Sasha Levin
2020-09-07 16:32 ` [PATCH AUTOSEL 5.8 40/53] HID: elan: Fix memleak in elan_input_configured Sasha Levin
2020-09-07 16:32 ` [PATCH AUTOSEL 5.8 41/53] ARC: [plat-hsdk]: Switch ethernet phy-mode to rgmii-id Sasha Levin
2020-09-07 16:32 ` [PATCH AUTOSEL 5.8 42/53] cpufreq: intel_pstate: Refuse to turn off with HWP enabled Sasha Levin
2020-09-07 16:32 ` [PATCH AUTOSEL 5.8 43/53] cpufreq: intel_pstate: Fix intel_pstate_get_hwp_max() for turbo disabled Sasha Levin
2020-09-07 16:32 ` [PATCH AUTOSEL 5.8 44/53] net: usb: dm9601: Add USB ID of Keenetic Plus DSL Sasha Levin
2020-09-07 16:32 ` [PATCH AUTOSEL 5.8 45/53] arm64/module: set trampoline section flags regardless of CONFIG_DYNAMIC_FTRACE Sasha Levin
2020-09-07 16:32 ` [PATCH AUTOSEL 5.8 46/53] ALSA: hda: hdmi - add Rocketlake support Sasha Levin
2020-09-07 16:32 ` [PATCH AUTOSEL 5.8 47/53] ALSA: hda: fix a runtime pm issue in SOF when integrated GPU is disabled Sasha Levin
2020-09-07 16:32 ` [PATCH AUTOSEL 5.8 48/53] ALSA: hda: use consistent HDAudio spelling in comments/docs Sasha Levin
2020-09-07 16:32 ` [PATCH AUTOSEL 5.8 49/53] drivers/net/wan/hdlc: Change the default of hard_header_len to 0 Sasha Levin
2020-09-07 16:32 ` [PATCH AUTOSEL 5.8 50/53] drm/amdgpu: Fix bug in reporting voltage for CIK Sasha Levin
2020-09-07 16:32 ` [PATCH AUTOSEL 5.8 51/53] iommu/amd: Do not force direct mapping when SME is active Sasha Levin
2020-09-07 16:32 ` [PATCH AUTOSEL 5.8 52/53] iommu/amd: Do not use IOMMUv2 functionality " Sasha Levin
2020-09-07 16:32 ` [PATCH AUTOSEL 5.8 53/53] gcov: Disable gcov build with GCC 10 Sasha Levin