From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Jaesoo Lee <jalee@purestorage.com>,
	Christoph Hellwig <hch@lst.de>, Sagi Grimberg <sagi@grimberg.me>,
	Jens Axboe <axboe@kernel.dk>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.20 53/88] nvme-rdma: fix timeout handler
Date: Mon,  4 Mar 2019 09:22:36 +0100
Message-ID: <20190304081632.640708219@linuxfoundation.org>
In-Reply-To: <20190304081630.610632175@linuxfoundation.org>

4.20-stable review patch.  If anyone has any objections, please let me know.

------------------

[ Upstream commit 4c174e6366746ae8d49f9cc409f728eebb7a9ac9 ]

Currently, we have several problems with the timeout
handler:
1. If we time out in the controller establishment flow, we will hang
because we don't execute the error recovery (and we shouldn't, because
the create_ctrl flow needs to fail and clean up on its own).
2. We might also hang if we get a disconnect on a queue while the
controller is already deleting. This racy flow can cause the controller
disable/shutdown admin command to hang.

We cannot complete a timed-out request from the timeout handler without
mutual exclusion from the teardown flow (e.g. nvme_rdma_error_recovery_work).
So we serialize against it in the timeout handler and tear down the I/O and
admin queues to guarantee that no one races with us in completing the
request.

Reported-by: Jaesoo Lee <jalee@purestorage.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
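Note (not part of the commit): the decision the new handler makes can be
sketched as a small userspace model. This is only an illustration under
stated assumptions; every model_* and MODEL_* name below is hypothetical
and stands in for the kernel types touched by the diff. The two boolean
fields merely record which path was taken; they do not model the real
teardown or the err_work serialization.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the controller states; not the kernel enum. */
enum model_ctrl_state {
	MODEL_CTRL_CONNECTING,
	MODEL_CTRL_LIVE,
	MODEL_CTRL_RESETTING,
	MODEL_CTRL_DELETING,
};

/* Hypothetical stand-ins for BLK_EH_DONE and BLK_EH_RESET_TIMER. */
enum model_eh_return {
	MODEL_EH_DONE,
	MODEL_EH_RESET_TIMER,
};

struct model_ctrl {
	enum model_ctrl_state state;
	bool queues_torn_down;   /* set when the synchronous teardown path ran */
	bool recovery_scheduled; /* set when error recovery was queued */
};

/*
 * Model of the decision in the patched handler:
 *  - controller not live: tear the queues down right away; teardown
 *    completes every outstanding request, so the handler reports "done";
 *  - controller live: queue error recovery and re-arm the timer, leaving
 *    completion of the request to the recovery path.
 */
enum model_eh_return model_timeout(struct model_ctrl *ctrl)
{
	if (ctrl->state != MODEL_CTRL_LIVE) {
		ctrl->queues_torn_down = true;
		return MODEL_EH_DONE;
	}

	ctrl->recovery_scheduled = true;
	return MODEL_EH_RESET_TIMER;
}
```

In this model, a controller that is still connecting or already deleting
takes the first branch, which corresponds to the two hang scenarios listed
in the commit message; only a live controller defers to error recovery.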
 drivers/nvme/host/rdma.c | 26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index ab6ec7295bf90..6e24b20304b53 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1679,18 +1679,28 @@ static enum blk_eh_timer_return
 nvme_rdma_timeout(struct request *rq, bool reserved)
 {
 	struct nvme_rdma_request *req = blk_mq_rq_to_pdu(rq);
+	struct nvme_rdma_queue *queue = req->queue;
+	struct nvme_rdma_ctrl *ctrl = queue->ctrl;
 
-	dev_warn(req->queue->ctrl->ctrl.device,
-		 "I/O %d QID %d timeout, reset controller\n",
-		 rq->tag, nvme_rdma_queue_idx(req->queue));
+	dev_warn(ctrl->ctrl.device, "I/O %d QID %d timeout\n",
+		 rq->tag, nvme_rdma_queue_idx(queue));
 
-	/* queue error recovery */
-	nvme_rdma_error_recovery(req->queue->ctrl);
+	if (ctrl->ctrl.state != NVME_CTRL_LIVE) {
+		/*
+		 * Teardown immediately if controller times out while starting
+		 * or we are already started error recovery. all outstanding
+		 * requests are completed on shutdown, so we return BLK_EH_DONE.
+		 */
+		flush_work(&ctrl->err_work);
+		nvme_rdma_teardown_io_queues(ctrl, false);
+		nvme_rdma_teardown_admin_queue(ctrl, false);
+		return BLK_EH_DONE;
+	}
 
-	/* fail with DNR on cmd timeout */
-	nvme_req(rq)->status = NVME_SC_ABORT_REQ | NVME_SC_DNR;
+	dev_warn(ctrl->ctrl.device, "starting error recovery\n");
+	nvme_rdma_error_recovery(ctrl);
 
-	return BLK_EH_DONE;
+	return BLK_EH_RESET_TIMER;
 }
 
 static blk_status_t nvme_rdma_queue_rq(struct blk_mq_hw_ctx *hctx,
-- 
2.19.1



