Re: [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Marco Elver <elver@google.com>
To: asml.silence@gmail.com, axboe@kernel.dk
Cc: syzbot <syzbot+73554e2258b7b8bf0bbf@syzkaller.appspotmail.com>,
	io-uring@vger.kernel.org, linux-kernel@vger.kernel.org,
	syzkaller-bugs@googlegroups.com, dvyukov@google.com
Subject: Re: [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests
Date: Wed, 26 May 2021 17:48:25 +0200	[thread overview]
Message-ID: <YK5tyZNAFc8dh6ke@elver.google.com> (raw)
In-Reply-To: <000000000000fa9f7005c33d83b9@google.com>

On Wed, May 26, 2021 at 08:44AM -0700, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    a050a6d2 Merge tag 'perf-tools-fixes-for-v5.13-2021-05-24'..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=13205087d00000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=3bcc8a6b51ef8094
> dashboard link: https://syzkaller.appspot.com/bug?extid=73554e2258b7b8bf0bbf
> compiler:       Debian clang version 11.0.1-2
[...]
> write to 0xffff88811d8df330 of 8 bytes by task 3709 on cpu 1:
>  io_uring_clean_tctx fs/io_uring.c:9042 [inline]
>  __io_uring_cancel+0x261/0x3b0 fs/io_uring.c:9136
>  io_uring_files_cancel include/linux/io_uring.h:16 [inline]
>  do_exit+0x185/0x1560 kernel/exit.c:781
>  do_group_exit+0xce/0x1a0 kernel/exit.c:923
>  get_signal+0xfc3/0x1610 kernel/signal.c:2835
>  arch_do_signal_or_restart+0x2a/0x220 arch/x86/kernel/signal.c:789
>  handle_signal_work kernel/entry/common.c:147 [inline]
>  exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
>  exit_to_user_mode_prepare+0x109/0x190 kernel/entry/common.c:208
>  __syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline]
>  syscall_exit_to_user_mode+0x20/0x40 kernel/entry/common.c:301
>  do_syscall_64+0x56/0x90 arch/x86/entry/common.c:57
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
> 
> read to 0xffff88811d8df330 of 8 bytes by task 6412 on cpu 0:
>  io_uring_try_cancel_iowq fs/io_uring.c:8911 [inline]
>  io_uring_try_cancel_requests+0x1ce/0x8e0 fs/io_uring.c:8933
>  io_ring_exit_work+0x7c/0x1110 fs/io_uring.c:8736
>  process_one_work+0x3e9/0x8f0 kernel/workqueue.c:2276
>  worker_thread+0x636/0xae0 kernel/workqueue.c:2422
>  kthread+0x1d0/0x1f0 kernel/kthread.c:313
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294

I wasn't entirely sure if io_wq is guaranteed to remain live in this
case in io_uring_try_cancel_iowq(), but the comment there suggests it
does. In that case, I think the below patch would explain the situation
better and also propose a fix.

Thoughts?

Thanks,
-- Marco

------ >8 ------

From: Marco Elver <elver@google.com>
Date: Wed, 26 May 2021 16:56:37 +0200
Subject: [PATCH] io_uring: fix data race to avoid potential NULL-deref

Commit ba5ef6dc8a82 ("io_uring: fortify tctx/io_wq cleanup") introduced
setting tctx->io_wq to NULL a bit earlier. This has caused KCSAN to
detect a data race between between accesses to tctx->io_wq:

  write to 0xffff88811d8df330 of 8 bytes by task 3709 on cpu 1:
   io_uring_clean_tctx                  fs/io_uring.c:9042 [inline]
   __io_uring_cancel                    fs/io_uring.c:9136
   io_uring_files_cancel                include/linux/io_uring.h:16 [inline]
   do_exit                              kernel/exit.c:781
   do_group_exit                        kernel/exit.c:923
   get_signal                           kernel/signal.c:2835
   arch_do_signal_or_restart            arch/x86/kernel/signal.c:789
   handle_signal_work                   kernel/entry/common.c:147 [inline]
   exit_to_user_mode_loop               kernel/entry/common.c:171 [inline]
   ...
  read to 0xffff88811d8df330 of 8 bytes by task 6412 on cpu 0:
   io_uring_try_cancel_iowq             fs/io_uring.c:8911 [inline]
   io_uring_try_cancel_requests         fs/io_uring.c:8933
   io_ring_exit_work                    fs/io_uring.c:8736
   process_one_work                     kernel/workqueue.c:2276
   ...

With the config used, KCSAN only reports data races with value changes:
this implies that in the case here we also know that tctx->io_wq was
non-NULL. Therefore, depending on interleaving, we may end up with:

              [CPU 0]                 |        [CPU 1]
  io_uring_try_cancel_iowq()          | io_uring_clean_tctx()
    if (!tctx->io_wq) // false        |   ...
    ...                               |   tctx->io_wq = NULL
    io_wq_cancel_cb(tctx->io_wq, ...) |   ...
      -> NULL-deref                   |

Note: It is likely that thus far we've gotten lucky and the compiler
optimizes the double-read into a single read into a register -- but this
is never guaranteed, and can easily change with a different config!

Fix the data race by atomically accessing tctx->io_wq. Of course, this
assumes that a valid io_wq remains alive for the duration of
io_uring_try_cancel_iowq(), which should be the case per comment there.

Reported-by: syzbot+bf2b3d0435b9b728946c@syzkaller.appspotmail.com
Signed-off-by: Marco Elver <elver@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 5f82954004f6..c7e27b464cb6 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -8903,12 +8903,16 @@ static bool io_uring_try_cancel_iowq(struct io_ring_ctx *ctx)
 	mutex_lock(&ctx->uring_lock);
 	list_for_each_entry(node, &ctx->tctx_list, ctx_node) {
 		struct io_uring_task *tctx = node->task->io_uring;
+		struct io_wq *io_wq;

+		if (!tctx)
+			continue;
 		/*
 		 * io_wq will stay alive while we hold uring_lock, because it's
 		 * killed after ctx nodes, which requires to take the lock.
 		 */
-		if (!tctx || !tctx->io_wq)
+		io_wq = READ_ONCE(tctx->io_wq);
+		if (!io_wq)
 			continue;
 		cret = io_wq_cancel_cb(tctx->io_wq, io_cancel_ctx_cb, ctx, true);
 		ret |= (cret != IO_WQ_CANCEL_NOTFOUND);
@@ -9039,7 +9043,7 @@ static void io_uring_clean_tctx(struct io_uring_task *tctx)
 	struct io_tctx_node *node;
 	unsigned long index;

-	tctx->io_wq = NULL;
+	WRITE_ONCE(tctx->io_wq, NULL);
 	xa_for_each(&tctx->xa, index, node)
 		io_uring_del_task_file(index);
 	if (wq)
-- 
2.31.1.818.g46aad6cb9e-goog

next prev parent reply	other threads:[~2021-05-26 15:48 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-26 15:44 [syzbot] KCSAN: data-race in __io_uring_cancel / io_uring_try_cancel_requests syzbot
2021-05-26 15:48 ` Marco Elver [this message]
2021-05-26 15:52   ` Marco Elver
2021-05-26 16:29     ` Pavel Begunkov
2021-05-26 16:33       ` Pavel Begunkov
2021-05-26 16:36       ` Marco Elver
2021-05-26 20:31         ` Pavel Begunkov
2021-05-27  9:32           ` Marco Elver
2021-05-27 10:05             ` Pavel Begunkov

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:5f82954004f dfblob:c7e27b464cb )
 OR (
bs:"io_uring: fix data race to avoid potential NULL-deref" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YK5tyZNAFc8dh6ke@elver.google.com \
    --to=elver@google.com \
    --cc=asml.silence@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=dvyukov@google.com \
    --cc=io-uring@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=syzbot+73554e2258b7b8bf0bbf@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.