From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: David Jeffery <djeffery@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Keith Busch <kbusch@kernel.org>,
	virtualization@lists.linux-foundation.org,
	Bart Van Assche <bvanassche@acm.org>,
	Ming Lei <ming.lei@redhat.com>, Jens Axboe <axboe@kernel.dk>,
	Sasha Levin <sashal@kernel.org>,
	linux-block@vger.kernel.org
Subject: [PATCH AUTOSEL 6.0 14/73] blk-mq: avoid double ->queue_rq() because of early timeout
Date: Sun, 18 Dec 2022 11:06:42 -0500
Message-ID: <20221218160741.927862-14-sashal@kernel.org>
In-Reply-To: <20221218160741.927862-1-sashal@kernel.org>

From: David Jeffery <djeffery@redhat.com>

[ Upstream commit 82c229476b8f6afd7e09bc4dc77d89dc19ff7688 ]

David Jeffery found a double ->queue_rq() issue. So far it can be
triggered in VM use cases, where long vmexit latency, preemption latency
of the vCPU pthread, or a long page fault in the vCPU pthread can cause a
block IO request to time out after blk_mq_start_request() has been called
in ->queue_rq() but before the request has been queued to hardware. The
timeout handler may then handle the request by requeueing it, which
results in a second ->queue_rq() call for the same request and a kernel
panic.

So far, it has been the driver's responsibility to cover the race between
timeout and completion, so in theory this should be solved in the driver,
given that the driver has enough knowledge.

But this is really a common problem: many drivers could have a similar
issue, it could be hard to fix all affected drivers, and the race is not
even easy for a driver to handle. So David suggests this patch, which
solves the issue by draining in-progress ->queue_rq() calls.

Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Keith Busch <kbusch@kernel.org>
Cc: virtualization@lists.linux-foundation.org
Cc: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20221026051957.358818-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
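For readers who want to follow the control flow outside the kernel, the
two-pass scan introduced by this patch can be sketched in userspace C.
This is only an illustration under stated assumptions, not the kernel
implementation: every sim_* name is hypothetical, the request struct
carries only the fields the sketch needs, and sim_drain() merely counts
calls where the patch calls blk_mq_wait_quiesce_done(q).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct request; not the kernel layout. */
struct sim_request {
	bool in_flight;		/* models a started, not yet completed request */
	unsigned long deadline;	/* absolute expiry time in ticks */
	bool timed_out;		/* set by the second pass */
};

/* Mirrors the fields of struct blk_expired_data from the patch. */
struct sim_expired_data {
	bool has_timedout_rq;
	unsigned long next;
	unsigned long timeout_start;
};

/* Wrap-safe comparisons in the style of time_after_eq()/time_after(). */
static bool sim_time_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

static bool sim_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Expiry is judged against the snapshot taken when the scan started. */
static bool sim_req_expired(const struct sim_request *rq,
			    struct sim_expired_data *expired)
{
	if (!rq->in_flight)
		return false;
	if (sim_time_after_eq(expired->timeout_start, rq->deadline))
		return true;
	if (expired->next == 0 || sim_time_after(expired->next, rq->deadline))
		expired->next = rq->deadline;
	return false;
}

/* Placeholder for the drain step; the sketch only counts invocations. */
static unsigned int sim_drain_calls;

static void sim_drain(void)
{
	sim_drain_calls++;
}

/*
 * Pass 1 only detects whether anything expired and stops at the first hit.
 * Pass 2 runs after the drain and actually marks expired requests.
 * Returns the number of requests marked; *next gets the earliest pending
 * deadline, or 0 if there is none.
 */
static size_t sim_timeout_scan(struct sim_request *reqs, size_t n,
			       unsigned long now, unsigned long *next)
{
	struct sim_expired_data expired = { .timeout_start = now };
	size_t i, handled = 0;

	assert(reqs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (sim_req_expired(&reqs[i], &expired)) {
			expired.has_timedout_rq = true;
			break;
		}
	}

	if (expired.has_timedout_rq) {
		sim_drain();
		expired.next = 0;
		for (i = 0; i < n; i++) {
			if (sim_req_expired(&reqs[i], &expired)) {
				reqs[i].timed_out = true;
				handled++;
			}
		}
	}

	*next = expired.next;
	return handled;
}
```

The sketch pays for the drain only when the first pass finds an expired
request, which matches the has_timedout_rq check in the diff below.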
 block/blk-mq.c | 56 +++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 44 insertions(+), 12 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 3f1f5e3e0951..1a30f6580274 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1442,7 +1442,13 @@ static void blk_mq_rq_timed_out(struct request *req)
 	blk_add_timer(req);
 }
 
-static bool blk_mq_req_expired(struct request *rq, unsigned long *next)
+struct blk_expired_data {
+	bool has_timedout_rq;
+	unsigned long next;
+	unsigned long timeout_start;
+};
+
+static bool blk_mq_req_expired(struct request *rq, struct blk_expired_data *expired)
 {
 	unsigned long deadline;
 
@@ -1452,13 +1458,13 @@ static bool blk_mq_req_expired(struct request *rq, unsigned long *next)
 		return false;
 
 	deadline = READ_ONCE(rq->deadline);
-	if (time_after_eq(jiffies, deadline))
+	if (time_after_eq(expired->timeout_start, deadline))
 		return true;
 
-	if (*next == 0)
-		*next = deadline;
-	else if (time_after(*next, deadline))
-		*next = deadline;
+	if (expired->next == 0)
+		expired->next = deadline;
+	else if (time_after(expired->next, deadline))
+		expired->next = deadline;
 	return false;
 }
 
@@ -1472,7 +1478,7 @@ void blk_mq_put_rq_ref(struct request *rq)
 
 static bool blk_mq_check_expired(struct request *rq, void *priv)
 {
-	unsigned long *next = priv;
+	struct blk_expired_data *expired = priv;
 
 	/*
 	 * blk_mq_queue_tag_busy_iter() has locked the request, so it cannot
@@ -1481,7 +1487,18 @@ static bool blk_mq_check_expired(struct request *rq, void *priv)
 	 * it was completed and reallocated as a new request after returning
 	 * from blk_mq_check_expired().
 	 */
-	if (blk_mq_req_expired(rq, next))
+	if (blk_mq_req_expired(rq, expired)) {
+		expired->has_timedout_rq = true;
+		return false;
+	}
+	return true;
+}
+
+static bool blk_mq_handle_expired(struct request *rq, void *priv)
+{
+	struct blk_expired_data *expired = priv;
+
+	if (blk_mq_req_expired(rq, expired))
 		blk_mq_rq_timed_out(rq);
 	return true;
 }
@@ -1490,7 +1507,9 @@ static void blk_mq_timeout_work(struct work_struct *work)
 {
 	struct request_queue *q =
 		container_of(work, struct request_queue, timeout_work);
-	unsigned long next = 0;
+	struct blk_expired_data expired = {
+		.timeout_start = jiffies,
+	};
 	struct blk_mq_hw_ctx *hctx;
 	unsigned long i;
 
@@ -1510,10 +1529,23 @@ static void blk_mq_timeout_work(struct work_struct *work)
 	if (!percpu_ref_tryget(&q->q_usage_counter))
 		return;
 
-	blk_mq_queue_tag_busy_iter(q, blk_mq_check_expired, &next);
+	/* check if there is any timed-out request */
+	blk_mq_queue_tag_busy_iter(q, blk_mq_check_expired, &expired);
+	if (expired.has_timedout_rq) {
+		/*
+		 * Before walking tags, we must ensure any submit started
+		 * before the current time has finished. Since the submit
+		 * uses srcu or rcu, wait for a synchronization point to
+		 * ensure all running submits have finished
+		 */
+		blk_mq_wait_quiesce_done(q);
+
+		expired.next = 0;
+		blk_mq_queue_tag_busy_iter(q, blk_mq_handle_expired, &expired);
+	}
 
-	if (next != 0) {
-		mod_timer(&q->timeout, next);
+	if (expired.next != 0) {
+		mod_timer(&q->timeout, expired.next);
 	} else {
 		/*
 		 * Request timeouts are handled as a forward rolling timer. If
-- 
2.35.1


