From: Marco Elver <elver@google.com>
To: asml.silence@gmail.com, axboe@kernel.dk
Cc: syzbot <syzbot+73554e2258b7b8bf0bbf@syzkaller.appspotmail.com>,
io-uring@vger.kernel.org, linux-kernel@vger.kernel.org,
syzkaller-bugs@googlegroups.com, dvyukov@google.com
Subject: Re: [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests
Date: Wed, 26 May 2021 17:48:25 +0200 [thread overview]
Message-ID: <YK5tyZNAFc8dh6ke@elver.google.com> (raw)
In-Reply-To: <000000000000fa9f7005c33d83b9@google.com>
On Wed, May 26, 2021 at 08:44AM -0700, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: a050a6d2 Merge tag 'perf-tools-fixes-for-v5.13-2021-05-24'..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=13205087d00000
> kernel config: https://syzkaller.appspot.com/x/.config?x=3bcc8a6b51ef8094
> dashboard link: https://syzkaller.appspot.com/bug?extid=73554e2258b7b8bf0bbf
> compiler: Debian clang version 11.0.1-2
[...]
> write to 0xffff88811d8df330 of 8 bytes by task 3709 on cpu 1:
> io_uring_clean_tctx fs/io_uring.c:9042 [inline]
> __io_uring_cancel+0x261/0x3b0 fs/io_uring.c:9136
> io_uring_files_cancel include/linux/io_uring.h:16 [inline]
> do_exit+0x185/0x1560 kernel/exit.c:781
> do_group_exit+0xce/0x1a0 kernel/exit.c:923
> get_signal+0xfc3/0x1610 kernel/signal.c:2835
> arch_do_signal_or_restart+0x2a/0x220 arch/x86/kernel/signal.c:789
> handle_signal_work kernel/entry/common.c:147 [inline]
> exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
> exit_to_user_mode_prepare+0x109/0x190 kernel/entry/common.c:208
> __syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline]
> syscall_exit_to_user_mode+0x20/0x40 kernel/entry/common.c:301
> do_syscall_64+0x56/0x90 arch/x86/entry/common.c:57
> entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> read to 0xffff88811d8df330 of 8 bytes by task 6412 on cpu 0:
> io_uring_try_cancel_iowq fs/io_uring.c:8911 [inline]
> io_uring_try_cancel_requests+0x1ce/0x8e0 fs/io_uring.c:8933
> io_ring_exit_work+0x7c/0x1110 fs/io_uring.c:8736
> process_one_work+0x3e9/0x8f0 kernel/workqueue.c:2276
> worker_thread+0x636/0xae0 kernel/workqueue.c:2422
> kthread+0x1d0/0x1f0 kernel/kthread.c:313
> ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
I wasn't entirely sure if io_wq is guaranteed to remain live in this
case in io_uring_try_cancel_iowq(), but the comment there suggests it
does. In that case, I think the below patch would explain the situation
better and also propose a fix.
Thoughts?
Thanks,
-- Marco
------ >8 ------
From: Marco Elver <elver@google.com>
Date: Wed, 26 May 2021 16:56:37 +0200
Subject: [PATCH] io_uring: fix data race to avoid potential NULL-deref
Commit ba5ef6dc8a82 ("io_uring: fortify tctx/io_wq cleanup") introduced
setting tctx->io_wq to NULL a bit earlier. This has caused KCSAN to
detect a data race between between accesses to tctx->io_wq:
write to 0xffff88811d8df330 of 8 bytes by task 3709 on cpu 1:
io_uring_clean_tctx fs/io_uring.c:9042 [inline]
__io_uring_cancel fs/io_uring.c:9136
io_uring_files_cancel include/linux/io_uring.h:16 [inline]
do_exit kernel/exit.c:781
do_group_exit kernel/exit.c:923
get_signal kernel/signal.c:2835
arch_do_signal_or_restart arch/x86/kernel/signal.c:789
handle_signal_work kernel/entry/common.c:147 [inline]
exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
...
read to 0xffff88811d8df330 of 8 bytes by task 6412 on cpu 0:
io_uring_try_cancel_iowq fs/io_uring.c:8911 [inline]
io_uring_try_cancel_requests fs/io_uring.c:8933
io_ring_exit_work fs/io_uring.c:8736
process_one_work kernel/workqueue.c:2276
...
With the config used, KCSAN only reports data races with value changes:
this implies that in the case here we also know that tctx->io_wq was
non-NULL. Therefore, depending on interleaving, we may end up with:
[CPU 0] | [CPU 1]
io_uring_try_cancel_iowq() | io_uring_clean_tctx()
if (!tctx->io_wq) // false | ...
... | tctx->io_wq = NULL
io_wq_cancel_cb(tctx->io_wq, ...) | ...
-> NULL-deref |
Note: It is likely that thus far we've gotten lucky and the compiler
optimizes the double-read into a single read into a register -- but this
is never guaranteed, and can easily change with a different config!
Fix the data race by atomically accessing tctx->io_wq. Of course, this
assumes that a valid io_wq remains alive for the duration of
io_uring_try_cancel_iowq(), which should be the case per comment there.
Reported-by: syzbot+bf2b3d0435b9b728946c@syzkaller.appspotmail.com
Signed-off-by: Marco Elver <elver@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Pavel Begunkov <asml.silence@gmail.com>
---
fs/io_uring.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 5f82954004f6..c7e27b464cb6 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -8903,12 +8903,16 @@ static bool io_uring_try_cancel_iowq(struct io_ring_ctx *ctx)
mutex_lock(&ctx->uring_lock);
list_for_each_entry(node, &ctx->tctx_list, ctx_node) {
struct io_uring_task *tctx = node->task->io_uring;
+ struct io_wq *io_wq;
+ if (!tctx)
+ continue;
/*
* io_wq will stay alive while we hold uring_lock, because it's
* killed after ctx nodes, which requires to take the lock.
*/
- if (!tctx || !tctx->io_wq)
+ io_wq = READ_ONCE(tctx->io_wq);
+ if (!io_wq)
continue;
cret = io_wq_cancel_cb(tctx->io_wq, io_cancel_ctx_cb, ctx, true);
ret |= (cret != IO_WQ_CANCEL_NOTFOUND);
@@ -9039,7 +9043,7 @@ static void io_uring_clean_tctx(struct io_uring_task *tctx)
struct io_tctx_node *node;
unsigned long index;
- tctx->io_wq = NULL;
+ WRITE_ONCE(tctx->io_wq, NULL);
xa_for_each(&tctx->xa, index, node)
io_uring_del_task_file(index);
if (wq)
--
2.31.1.818.g46aad6cb9e-goog
next prev parent reply other threads:[~2021-05-26 15:48 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-05-26 15:44 [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests syzbot
2021-05-26 15:48 ` Marco Elver [this message]
2021-05-26 15:52 ` Marco Elver
2021-05-26 16:29 ` Pavel Begunkov
2021-05-26 16:33 ` Pavel Begunkov
2021-05-26 16:36 ` Marco Elver
2021-05-26 20:31 ` Pavel Begunkov
2021-05-27 9:32 ` Marco Elver
2021-05-27 10:05 ` Pavel Begunkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YK5tyZNAFc8dh6ke@elver.google.com \
--to=elver@google.com \
--cc=asml.silence@gmail.com \
--cc=axboe@kernel.dk \
--cc=dvyukov@google.com \
--cc=io-uring@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=syzbot+73554e2258b7b8bf0bbf@syzkaller.appspotmail.com \
--cc=syzkaller-bugs@googlegroups.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.