patches.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Keith Busch <kbusch@kernel.org>,
	Jens Axboe <axboe@kernel.dk>
Subject: [PATCH 6.1 63/64] block: treat poll queue enter similarly to timeouts
Date: Tue, 13 Feb 2024 18:21:49 +0100	[thread overview]
Message-ID: <20240213171846.717757238@linuxfoundation.org> (raw)
In-Reply-To: <20240213171844.702064831@linuxfoundation.org>

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jens Axboe <axboe@kernel.dk>

commit 33391eecd63158536fb5257fee5be3a3bdc30e3c upstream.

We ran into an issue where a production workload would randomly grind to
a halt and not continue until the pending IO had timed out. This turned
out to be a complicated interaction between queue freezing and polled
IO:

1) You have an application that does polled IO. At any point in time,
   there may be polled IO pending.

2) You have a monitoring application that issues a passthrough command,
   which is marked with side effects such that it needs to freeze the
   queue.

3) Passthrough command is started, which calls blk_freeze_queue_start()
   on the device. At this point the queue is marked frozen, and any
   attempt to enter the queue will fail (for non-blocking) or block.

4) Now the driver calls blk_mq_freeze_queue_wait(), which will return
   when the queue is quiesced and pending IO has completed.

5) The pending IO is polled IO, but any attempt to poll IO through the
   normal iocb_bio_iopoll() -> bio_poll() will fail when it gets to
   bio_queue_enter() as the queue is frozen. Rather than poll and
   complete IO, the polling threads will sit in a tight loop attempting
   to poll, but failing to enter the queue to do so.

The end result is that progress for either application will be stalled
until all pending polled IO has timed out. This causes obvious huge
latency issues for the application doing polled IO, but also long delays
for passthrough command.

Fix this by treating queue enter for polled IO just like we do for
timeouts. This allows quick quiesce of the queue as we still poll and
complete this IO, while still disallowing queueing up new IO.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 block/blk-core.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -864,7 +864,16 @@ int bio_poll(struct bio *bio, struct io_
 	 */
 	blk_flush_plug(current->plug, false);
 
-	if (bio_queue_enter(bio))
+	/*
+	 * We need to be able to enter a frozen queue, similar to how
+	 * timeouts also need to do that. If that is blocked, then we can
+	 * have pending IO when a queue freeze is started, and then the
+	 * wait for the freeze to finish will wait for polled requests to
+	 * timeout as the poller is preventer from entering the queue and
+	 * completing them. As long as we prevent new IO from being queued,
+	 * that should be all that matters.
+	 */
+	if (!percpu_ref_tryget(&q->q_usage_counter))
 		return 0;
 	if (queue_is_mq(q)) {
 		ret = blk_mq_poll(q, cookie, iob, flags);



  parent reply	other threads:[~2024-02-13 17:26 UTC|newest]

Thread overview: 80+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-13 17:20 [PATCH 6.1 00/64] 6.1.78-rc1 review Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 01/64] ext4: regenerate buddy after block freeing failed if under fc replay Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 02/64] dmaengine: fsl-dpaa2-qdma: Fix the size of dma pools Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 03/64] dmaengine: ti: k3-udma: Report short packet errors Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 04/64] dmaengine: fsl-qdma: Fix a memory leak related to the status queue DMA Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 05/64] dmaengine: fsl-qdma: Fix a memory leak related to the queue command DMA Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 06/64] phy: renesas: rcar-gen3-usb2: Fix returning wrong error code Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 07/64] dmaengine: fix is_slave_direction() return false when DMA_DEV_TO_DEV Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 08/64] phy: ti: phy-omap-usb2: Fix NULL pointer dereference for SRP Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 09/64] cifs: failure to add channel on iface should bump up weight Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 10/64] drm/msms/dp: fixed link clock divider bits be over written in BPC unknown case Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 11/64] drm/msm/dp: return correct Colorimetry for DP_TEST_DYNAMIC_RANGE_CEA case Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 12/64] drm/msm/dpu: check for valid hw_pp in dpu_encoder_helper_phys_cleanup Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 13/64] net: stmmac: xgmac: fix handling of DPP safety error for DMA channels Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 14/64] wifi: mac80211: fix waiting for beacons logic Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 15/64] netdevsim: avoid potential loop in nsim_dev_trap_report_work() Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 16/64] net: atlantic: Fix DMA mapping for PTP hwts ring Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 17/64] selftests: net: cut more slack for gro fwd tests Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 18/64] selftests: net: avoid just another constant wait Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 19/64] tunnels: fix out of bounds access when building IPv6 PMTU error Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 20/64] atm: idt77252: fix a memleak in open_card_ubr0 Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 21/64] octeontx2-pf: Fix a memleak otx2_sq_init Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 22/64] hwmon: (aspeed-pwm-tacho) mutex for tach reading Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 23/64] hwmon: (coretemp) Fix out-of-bounds memory access Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 24/64] hwmon: (coretemp) Fix bogus core_id to attr name mapping Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 25/64] inet: read sk->sk_family once in inet_recv_error() Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 26/64] drm/i915/gvt: Fix uninitialized variable in handle_mmio() Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 27/64] rxrpc: Fix response to PING RESPONSE ACKs to a dead call Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 28/64] tipc: Check the bearer type before calling tipc_udp_nl_bearer_add() Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 29/64] af_unix: Call kfree_skb() for dead unix_(sk)->oob_skb in GC Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 30/64] ppp_async: limit MRU to 64K Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 31/64] selftests: cmsg_ipv6: repeat the exact packet Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 32/64] netfilter: nft_compat: narrow down revision to unsigned 8-bits Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 33/64] netfilter: nft_compat: reject unused compat flag Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 34/64] netfilter: nft_compat: restrict match/target protocol to u16 Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 35/64] drm/amd/display: Implement bounds check for stream encoder creation in DCN301 Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 36/64] netfilter: nft_ct: reject direction for ct id Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 37/64] netfilter: nft_set_pipapo: store index in scratch maps Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 38/64] netfilter: nft_set_pipapo: add helper to release pcpu scratch area Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 39/64] netfilter: nft_set_pipapo: remove scratch_aligned pointer Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 40/64] fs/ntfs3: Fix an NULL dereference bug Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 41/64] scsi: core: Move scsi_host_busy() out of host lock if it is for per-command Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 42/64] blk-iocost: Fix an UBSAN shift-out-of-bounds warning Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 43/64] fs: dlm: dont put dlm_local_addrs on heap Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 44/64] mtd: parsers: ofpart: add workaround for #size-cells 0 Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 45/64] ALSA: usb-audio: Add delay quirk for MOTU M Series 2nd revision Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 46/64] ALSA: usb-audio: Add a quirk for Yamaha YIT-W12TX transmitter Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 47/64] ALSA: usb-audio: add quirk for RODE NT-USB+ Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 48/64] USB: serial: qcserial: add new usb-id for Dell Wireless DW5826e Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 49/64] USB: serial: option: add Fibocom FM101-GL variant Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 50/64] USB: serial: cp210x: add ID for IMST iM871A-USB Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 51/64] usb: dwc3: host: Set XHCI_SG_TRB_CACHE_SIZE_QUIRK Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 52/64] usb: host: xhci-plat: Add support for XHCI_SG_TRB_CACHE_SIZE_QUIRK Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 53/64] xhci: process isoc TD properly when there was a transaction error mid TD Greg Kroah-Hartman
2024-02-13 18:47   ` Michał Pecio
2024-02-14 14:23     ` Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 54/64] xhci: handle isoc Babble and Buffer Overrun events properly Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 55/64] hrtimer: Report offline hrtimer enqueue Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 56/64] Input: i8042 - fix strange behavior of touchpad on Clevo NS70PU Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 57/64] Input: atkbd - skip ATKBD_CMD_SETLEDS when skipping ATKBD_CMD_GETID Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 58/64] io_uring/net: fix sr->len for IORING_OP_RECV with MSG_WAITALL and buffers Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 59/64] Revert "ASoC: amd: Add new dmi entries for acp5x platform" Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 60/64] vhost: use kzalloc() instead of kmalloc() followed by memset() Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 61/64] RDMA/irdma: Fix support for 64k pages Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 62/64] f2fs: add helper to check compression level Greg Kroah-Hartman
2024-02-13 17:21 ` Greg Kroah-Hartman [this message]
2024-02-13 17:21 ` [PATCH 6.1 64/64] clocksource: Skip watchdog check for large watchdog intervals Greg Kroah-Hartman
2024-02-13 19:03 ` [PATCH 6.1 00/64] 6.1.78-rc1 review SeongJae Park
2024-02-13 19:34 ` Pavel Machek
2024-02-13 21:04 ` Kelsey Steele
2024-02-13 21:15 ` Florian Fainelli
2024-02-13 22:46 ` Allen
2024-02-14  0:17 ` Shuah Khan
2024-02-14  9:03 ` Jon Hunter
2024-02-14 13:08   ` Greg Kroah-Hartman
2024-02-14 13:15     ` Jon Hunter
2024-02-14 14:23       ` Greg Kroah-Hartman
2024-02-14 11:07 ` Naresh Kamboju
2024-02-14 11:22 ` Yann Sionneau
2024-02-14 16:44 ` Sven Joachim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240213171846.717757238@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=axboe@kernel.dk \
    --cc=kbusch@kernel.org \
    --cc=patches@lists.linux.dev \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).