From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Peter Mann <peter.mann@sh.cz>,
	Jens Axboe <axboe@kernel.dk>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.10 44/82] io_uring/rw: fix missing NOWAIT check for O_DIRECT start write
Date: Fri, 15 Nov 2024 07:38:21 +0100
Message-ID: <20241115063727.146110254@linuxfoundation.org>
In-Reply-To: <20241115063725.561151311@linuxfoundation.org>

5.10-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jens Axboe <axboe@kernel.dk>

Commit 1d60d74e852647255bd8e76f5a22dc42531e4389 upstream.

When io_uring starts a write, it calls kiocb_start_write() to take the
super block rwsem for reading, preventing any freeze from happening while
that write is in flight. The freeze side grabs that rwsem for writing,
which excludes new writers and waits for existing writes to finish. But
io_uring unconditionally uses kiocb_start_write(), which will block if
someone is currently attempting to freeze the mount point. This causes a
deadlock: the freeze is waiting for previous writes to complete, but
those writes cannot complete because the task that is supposed to
complete them is blocked trying to start a new write. The following
stuck trace shows that dependency, with the writer blocked starting a
new write:

task:fio             state:D stack:0     pid:886   tgid:886   ppid:876
Call trace:
 __switch_to+0x1d8/0x348
 __schedule+0x8e8/0x2248
 schedule+0x110/0x3f0
 percpu_rwsem_wait+0x1e8/0x3f8
 __percpu_down_read+0xe8/0x500
 io_write+0xbb8/0xff8
 io_issue_sqe+0x10c/0x1020
 io_submit_sqes+0x614/0x2110
 __arm64_sys_io_uring_enter+0x524/0x1038
 invoke_syscall+0x74/0x268
 el0_svc_common.constprop.0+0x160/0x238
 do_el0_svc+0x44/0x60
 el0_svc+0x44/0xb0
 el0t_64_sync_handler+0x118/0x128
 el0t_64_sync+0x168/0x170
INFO: task fsfreeze:7364 blocked for more than 15 seconds.
      Not tainted 6.12.0-rc5-00063-g76aaf945701c #7963

with the attempting freezer stuck trying to grab the rwsem:

task:fsfreeze        state:D stack:0     pid:7364  tgid:7364  ppid:995
Call trace:
 __switch_to+0x1d8/0x348
 __schedule+0x8e8/0x2248
 schedule+0x110/0x3f0
 percpu_down_write+0x2b0/0x680
 freeze_super+0x248/0x8a8
 do_vfs_ioctl+0x149c/0x1b18
 __arm64_sys_ioctl+0xd0/0x1a0
 invoke_syscall+0x74/0x268
 el0_svc_common.constprop.0+0x160/0x238
 do_el0_svc+0x44/0x60
 el0_svc+0x44/0xb0
 el0t_64_sync_handler+0x118/0x128
 el0t_64_sync+0x168/0x170

Fix this by having the io_uring side honor IOCB_NOWAIT, and only attempt a
blocking grab of the super block rwsem if it isn't set. For normal issue,
where IOCB_NOWAIT would always be set, a failed trylock returns -EAGAIN,
which will have io_uring core issue a blocking attempt of the write. That
will in turn also get completions run, ensuring forward progress.

Since freezing requires CAP_SYS_ADMIN in the first place, this isn't
something that can be triggered by a regular user.

Cc: stable@vger.kernel.org # 5.10+
Reported-by: Peter Mann <peter.mann@sh.cz>
Link: https://lore.kernel.org/io-uring/38c94aec-81c9-4f62-b44e-1d87f5597644@sh.cz
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 io_uring/io_uring.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index a6afdea5cfd8e..57c51e9638753 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3719,6 +3719,25 @@ static int io_write_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 	return io_prep_rw(req, sqe, WRITE);
 }
 
+static bool io_kiocb_start_write(struct io_kiocb *req, struct kiocb *kiocb)
+{
+	struct inode *inode;
+	bool ret;
+
+	if (!(req->flags & REQ_F_ISREG))
+		return true;
+	if (!(kiocb->ki_flags & IOCB_NOWAIT)) {
+		kiocb_start_write(kiocb);
+		return true;
+	}
+
+	inode = file_inode(kiocb->ki_filp);
+	ret = sb_start_write_trylock(inode->i_sb);
+	if (ret)
+		__sb_writers_release(inode->i_sb, SB_FREEZE_WRITE);
+	return ret;
+}
+
 static int io_write(struct io_kiocb *req, unsigned int issue_flags)
 {
 	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
@@ -3765,8 +3784,8 @@ static int io_write(struct io_kiocb *req, unsigned int issue_flags)
 	if (unlikely(ret))
 		goto out_free;
 
-	if (req->flags & REQ_F_ISREG)
-		kiocb_start_write(kiocb);
+	if (unlikely(!io_kiocb_start_write(req, kiocb)))
+		goto copy_iov;
 	kiocb->ki_flags |= IOCB_WRITE;
 
 	if (req->file->f_op->write_iter)
-- 
2.43.0



