From: Jens Axboe <axboe@kernel.dk>
To: Caleb Sander Mateos <csander@purestorage.com>,
	syzbot ci <syzbot+ci6d21afd0455de45a@syzkaller.appspotmail.com>
Cc: io-uring@vger.kernel.org, joannelkoong@gmail.com,
	linux-kernel@vger.kernel.org, oliver.sang@intel.com,
	syzbot@syzkaller.appspotmail.com, syzbot@lists.linux.dev,
	syzkaller-bugs@googlegroups.com
Subject: Re: [syzbot ci] Re: io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER
Date: Sun, 18 Jan 2026 11:34:15 -0700
Message-ID: <0bc36797-fe4e-46ba-933d-0b3d508ed0dd@kernel.dk>
In-Reply-To: <CADUfDZq7MK3r6c05CohT0hMowq-gqffGid-eC1cDGKy+4aaS=A@mail.gmail.com>

On 12/22/25 1:19 PM, Caleb Sander Mateos wrote:
> On Thu, Dec 18, 2025 at 3:01 AM syzbot ci
> <syzbot+ci6d21afd0455de45a@syzkaller.appspotmail.com> wrote:
>>
>> syzbot ci has tested the following series
>>
>> [v6] io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER
>> https://lore.kernel.org/all/20251218024459.1083572-1-csander@purestorage.com
>> * [PATCH v6 1/6] io_uring: use release-acquire ordering for IORING_SETUP_R_DISABLED
>> * [PATCH v6 2/6] io_uring: clear IORING_SETUP_SINGLE_ISSUER for IORING_SETUP_SQPOLL
>> * [PATCH v6 3/6] io_uring: ensure submitter_task is valid for io_ring_ctx's lifetime
>> * [PATCH v6 4/6] io_uring: use io_ring_submit_lock() in io_iopoll_req_issued()
>> * [PATCH v6 5/6] io_uring: factor out uring_lock helpers
>> * [PATCH v6 6/6] io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER
>>
>> and found the following issue:
>> INFO: task hung in io_wq_put_and_exit
>>
>> Full report is available here:
>> https://ci.syzbot.org/series/21eac721-670b-4f34-9696-66f9b28233ac
>>
>> ***
>>
>> INFO: task hung in io_wq_put_and_exit
>>
>> tree:      torvalds
>> URL:       https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux
>> base:      d358e5254674b70f34c847715ca509e46eb81e6f
>> arch:      amd64
>> compiler:  Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
>> config:    https://ci.syzbot.org/builds/1710cffe-7d78-4489-9aa1-823b8c2532ed/config
>> syz repro: https://ci.syzbot.org/findings/74ae8703-9484-4d82-aa78-84cc37dcb1ef/syz_repro
>>
>> INFO: task syz.1.18:6046 blocked for more than 143 seconds.
>>       Not tainted syzkaller #0
>>       Blocked by coredump.
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:syz.1.18        state:D stack:25672 pid:6046  tgid:6045  ppid:5971   task_flags:0x400548 flags:0x00080004
>> Call Trace:
>>  <TASK>
>>  context_switch kernel/sched/core.c:5256 [inline]
>>  __schedule+0x14bc/0x5000 kernel/sched/core.c:6863
>>  __schedule_loop kernel/sched/core.c:6945 [inline]
>>  schedule+0x165/0x360 kernel/sched/core.c:6960
>>  schedule_timeout+0x9a/0x270 kernel/time/sleep_timeout.c:75
>>  do_wait_for_common kernel/sched/completion.c:100 [inline]
>>  __wait_for_common kernel/sched/completion.c:121 [inline]
>>  wait_for_common kernel/sched/completion.c:132 [inline]
>>  wait_for_completion+0x2bf/0x5d0 kernel/sched/completion.c:153
>>  io_wq_exit_workers io_uring/io-wq.c:1328 [inline]
>>  io_wq_put_and_exit+0x316/0x650 io_uring/io-wq.c:1356
>>  io_uring_clean_tctx+0x11f/0x1a0 io_uring/tctx.c:207
>>  io_uring_cancel_generic+0x6ca/0x7d0 io_uring/cancel.c:652
>>  io_uring_files_cancel include/linux/io_uring.h:19 [inline]
>>  do_exit+0x345/0x2310 kernel/exit.c:911
>>  do_group_exit+0x21c/0x2d0 kernel/exit.c:1112
>>  get_signal+0x1285/0x1340 kernel/signal.c:3034
>>  arch_do_signal_or_restart+0x9a/0x7a0 arch/x86/kernel/signal.c:337
>>  __exit_to_user_mode_loop kernel/entry/common.c:41 [inline]
>>  exit_to_user_mode_loop+0x87/0x4f0 kernel/entry/common.c:75
>>  __exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline]
>>  syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:256 [inline]
>>  syscall_exit_to_user_mode_work include/linux/entry-common.h:159 [inline]
>>  syscall_exit_to_user_mode include/linux/entry-common.h:194 [inline]
>>  do_syscall_64+0x2e3/0xf80 arch/x86/entry/syscall_64.c:100
>>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
>> RIP: 0033:0x7f6a8b58f7c9
>> RSP: 002b:00007f6a8c4a00e8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
>> RAX: 0000000000000001 RBX: 00007f6a8b7e5fa8 RCX: 00007f6a8b58f7c9
>> RDX: 00000000000f4240 RSI: 0000000000000081 RDI: 00007f6a8b7e5fac
>> RBP: 00007f6a8b7e5fa0 R08: 3fffffffffffffff R09: 0000000000000000
>> R10: 0000000000000800 R11: 0000000000000246 R12: 0000000000000000
>> R13: 00007f6a8b7e6038 R14: 00007ffcac96d220 R15: 00007ffcac96d308
>>  </TASK>
>> INFO: task iou-wrk-6046:6047 blocked for more than 143 seconds.
>>       Not tainted syzkaller #0
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:iou-wrk-6046    state:D stack:27760 pid:6047  tgid:6045  ppid:5971   task_flags:0x404050 flags:0x00080002
>> Call Trace:
>>  <TASK>
>>  context_switch kernel/sched/core.c:5256 [inline]
>>  __schedule+0x14bc/0x5000 kernel/sched/core.c:6863
>>  __schedule_loop kernel/sched/core.c:6945 [inline]
>>  schedule+0x165/0x360 kernel/sched/core.c:6960
>>  schedule_timeout+0x9a/0x270 kernel/time/sleep_timeout.c:75
>>  do_wait_for_common kernel/sched/completion.c:100 [inline]
>>  __wait_for_common kernel/sched/completion.c:121 [inline]
>>  wait_for_common kernel/sched/completion.c:132 [inline]
>>  wait_for_completion+0x2bf/0x5d0 kernel/sched/completion.c:153
>>  io_ring_ctx_lock_nested+0x2b3/0x380 io_uring/io_uring.h:283
>>  io_ring_ctx_lock io_uring/io_uring.h:290 [inline]
>>  io_ring_submit_lock io_uring/io_uring.h:554 [inline]
>>  io_files_update+0x677/0x7f0 io_uring/rsrc.c:504
>>  __io_issue_sqe+0x181/0x4b0 io_uring/io_uring.c:1818
>>  io_issue_sqe+0x1de/0x1190 io_uring/io_uring.c:1841
>>  io_wq_submit_work+0x6e9/0xb90 io_uring/io_uring.c:1953
>>  io_worker_handle_work+0x7cd/0x1180 io_uring/io-wq.c:650
>>  io_wq_worker+0x42f/0xeb0 io_uring/io-wq.c:704
>>  ret_from_fork+0x599/0xb30 arch/x86/kernel/process.c:158
>>  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246
>>  </TASK>
> 
> Interesting, a deadlock between io_wq_exit_workers() on submitter_task
> (which is exiting) and io_ring_ctx_lock() on an io_uring worker
> thread. io_ring_ctx_lock() is blocked until submitter_task runs task
> work, but that will never happen because it's waiting on the
> completion. Not sure what the best approach is here. Maybe have the
> submitter_task alternate between running task work and waiting on the
> completion? Or have some way for submitter_task to indicate that it's
> exiting and disable the IORING_SETUP_SINGLE_ISSUER optimization in
> io_ring_ctx_lock()?

Finally got around to taking a look at this patchset today, and it does
look sound to me. For cases that have zero expected io-wq activity, it
seems like a no-brainer. For cases that have a lot of expected io-wq
activity, which are basically only things like fs/storage workloads on
suboptimal configurations, the suspend/resume mechanism may be
troublesome. But not quite sure what to do about that, or if it's even
noticeable?

For the case in question, yes I think we'll need the completion wait
cases to break for running task_work.
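
To make the hang and the proposed direction concrete, here is a minimal
user-space sketch. It is not kernel code and not the patchset's
implementation: it models the exiting submitter_task and one io-wq worker
as Python threads, and every name in it (Submitter, worker, exit_naive,
exit_with_task_work, simulate) is hypothetical. The only behavior taken
from the thread is that the worker cannot proceed until the submitter
runs task work, while the submitter is waiting for the worker to finish.

```python
import queue
import threading
import time


class Submitter:
    """Hypothetical stand-in for the exiting submitter_task."""

    def __init__(self):
        self.task_work = queue.Queue()        # callbacks queued by workers
        self.worker_done = threading.Event()  # models the exit completion

    def run_task_work(self):
        """Run every queued callback; return how many ran."""
        ran = 0
        while True:
            try:
                fn = self.task_work.get_nowait()
            except queue.Empty:
                return ran
            fn()
            ran += 1


def worker(sub):
    """Models the io-wq worker: it needs the submitter to run task work
    before it can take the ring lock, and only then can it finish."""
    granted = threading.Event()
    sub.task_work.put(granted.set)  # ask the submitter to let us in
    granted.wait()                  # blocked until task work runs
    sub.worker_done.set()           # worker exits, completion fires


def exit_naive(sub, timeout=0.2):
    """Wait on the completion only. Task work never runs, so this can
    only time out (the timeout stands in for the hung-task report)."""
    return sub.worker_done.wait(timeout)


def exit_with_task_work(sub, poll=0.01, timeout=5.0):
    """Alternate between running task work and waiting on the completion."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        sub.run_task_work()
        if sub.worker_done.wait(poll):
            return True
    return False


def simulate(exit_fn):
    """Start one worker, run the given exit strategy, report success."""
    sub = Submitter()
    t = threading.Thread(target=worker, args=(sub,), daemon=True)
    t.start()
    result = exit_fn(sub)
    sub.run_task_work()  # cleanup so the naive case does not leak a thread
    t.join(timeout=1.0)
    return result
```

Calling simulate(exit_naive) reports failure because the completion can
never fire, while simulate(exit_with_task_work) succeeds once the queued
callback has been run. The polling loop is a simplification chosen for the
sketch; an actual fix would presumably use an interruptible wait that
wakes when task work is pending rather than a fixed poll interval.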

-- 
Jens Axboe
