From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, Peter Mann <peter.mann@sh.cz>,
Jens Axboe <axboe@kernel.dk>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.10 44/82] io_uring/rw: fix missing NOWAIT check for O_DIRECT start write
Date: Fri, 15 Nov 2024 07:38:21 +0100
Message-ID: <20241115063727.146110254@linuxfoundation.org>
In-Reply-To: <20241115063725.561151311@linuxfoundation.org>

5.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jens Axboe <axboe@kernel.dk>

Commit 1d60d74e852647255bd8e76f5a22dc42531e4389 upstream.

When io_uring starts a write, it'll call kiocb_start_write() to bump the
super block rwsem, preventing any freezes from happening while that
write is in-flight. The freeze side will grab that rwsem for writing,
excluding any new writers from happening and waiting for existing writes
to finish. But io_uring unconditionally uses kiocb_start_write(), which
will block if someone is currently attempting to freeze the mount point.

This causes a deadlock where freeze is waiting for previous writes to
complete, but the previous writes cannot complete, as the task that is
supposed to complete them is blocked waiting on starting a new write.

This results in the following stuck trace showing that dependency with
the write blocked starting a new write:

task:fio state:D stack:0 pid:886 tgid:886 ppid:876
Call trace:
__switch_to+0x1d8/0x348
__schedule+0x8e8/0x2248
schedule+0x110/0x3f0
percpu_rwsem_wait+0x1e8/0x3f8
__percpu_down_read+0xe8/0x500
io_write+0xbb8/0xff8
io_issue_sqe+0x10c/0x1020
io_submit_sqes+0x614/0x2110
__arm64_sys_io_uring_enter+0x524/0x1038
invoke_syscall+0x74/0x268
el0_svc_common.constprop.0+0x160/0x238
do_el0_svc+0x44/0x60
el0_svc+0x44/0xb0
el0t_64_sync_handler+0x118/0x128
el0t_64_sync+0x168/0x170
INFO: task fsfreeze:7364 blocked for more than 15 seconds.
Not tainted 6.12.0-rc5-00063-g76aaf945701c #7963

with the attempting freezer stuck trying to grab the rwsem:

task:fsfreeze state:D stack:0 pid:7364 tgid:7364 ppid:995
Call trace:
__switch_to+0x1d8/0x348
__schedule+0x8e8/0x2248
schedule+0x110/0x3f0
percpu_down_write+0x2b0/0x680
freeze_super+0x248/0x8a8
do_vfs_ioctl+0x149c/0x1b18
__arm64_sys_ioctl+0xd0/0x1a0
invoke_syscall+0x74/0x268
el0_svc_common.constprop.0+0x160/0x238
do_el0_svc+0x44/0x60
el0_svc+0x44/0xb0
el0t_64_sync_handler+0x118/0x128
el0t_64_sync+0x168/0x170

Fix this by having the io_uring side honor IOCB_NOWAIT, and only attempt a
blocking grab of the super block rwsem if it isn't set. For normal issue
where IOCB_NOWAIT would always be set, this returns -EAGAIN which will
have io_uring core issue a blocking attempt of the write. That will in
turn also get completions run, ensuring forward progress.

Since freezing requires CAP_SYS_ADMIN in the first place, this isn't
something that can be triggered by a regular user.

Cc: stable@vger.kernel.org # 5.10+
Reported-by: Peter Mann <peter.mann@sh.cz>
Link: https://lore.kernel.org/io-uring/38c94aec-81c9-4f62-b44e-1d87f5597644@sh.cz
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
io_uring/io_uring.c | 23 +++++++++++++++++++++--
1 file changed, 21 insertions(+), 2 deletions(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index a6afdea5cfd8e..57c51e9638753 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3719,6 +3719,25 @@ static int io_write_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 	return io_prep_rw(req, sqe, WRITE);
 }
 
+static bool io_kiocb_start_write(struct io_kiocb *req, struct kiocb *kiocb)
+{
+	struct inode *inode;
+	bool ret;
+
+	if (!(req->flags & REQ_F_ISREG))
+		return true;
+	if (!(kiocb->ki_flags & IOCB_NOWAIT)) {
+		kiocb_start_write(kiocb);
+		return true;
+	}
+
+	inode = file_inode(kiocb->ki_filp);
+	ret = sb_start_write_trylock(inode->i_sb);
+	if (ret)
+		__sb_writers_release(inode->i_sb, SB_FREEZE_WRITE);
+	return ret;
+}
+
 static int io_write(struct io_kiocb *req, unsigned int issue_flags)
 {
 	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
@@ -3765,8 +3784,8 @@ static int io_write(struct io_kiocb *req, unsigned int issue_flags)
 	if (unlikely(ret))
 		goto out_free;
 
-	if (req->flags & REQ_F_ISREG)
-		kiocb_start_write(kiocb);
+	if (unlikely(!io_kiocb_start_write(req, kiocb)))
+		goto copy_iov;
 	kiocb->ki_flags |= IOCB_WRITE;
 
 	if (req->file->f_op->write_iter)
--
2.43.0