public inbox for linux-kernel@vger.kernel.org
From: Pavel Begunkov <asml.silence@gmail.com>
To: Jann Horn <jannh@google.com>, Jens Axboe <axboe@kernel.dk>,
	io-uring <io-uring@vger.kernel.org>
Cc: kernel list <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>, Will Deacon <will@kernel.org>,
	Waiman Long <longman@redhat.com>
Subject: Re: io_uring: incorrect assumption about mutex behavior on unlock?
Date: Fri, 1 Dec 2023 18:52:43 +0000
Message-ID: <42ef8260-7f92-4312-9291-19301aea3c30@gmail.com>
In-Reply-To: <CAG48ez3xSoYb+45f1RLtktROJrpiDQ1otNvdR+YLQf7m+Krj5Q@mail.gmail.com>

On 12/1/23 16:41, Jann Horn wrote:
> mutex_unlock() has a different API contract compared to spin_unlock().
> spin_unlock() can be used to release ownership of an object, so that
> as soon as the spinlock is unlocked, another task is allowed to free
> the object containing the spinlock.
> mutex_unlock() does not support this kind of usage: The caller of
> mutex_unlock() must ensure that the mutex stays alive until
> mutex_unlock() has returned.
> (See the thread
> <https://lore.kernel.org/all/20231130204817.2031407-1-jannh@google.com/>
> which discusses adding documentation about this.)
> (POSIX userspace mutexes are different from kernel mutexes, in
> userspace this pattern is allowed.)
> 
> io_ring_exit_work() has a comment that seems to assume that the
> uring_lock (which is a mutex) can be used as if the spinlock-style API
> contract applied:
> 
>      /*
>      * Some may use context even when all refs and requests have been put,
>      * and they are free to do so while still holding uring_lock or
>      * completion_lock, see io_req_task_submit(). Apart from other work,
>      * this lock/unlock section also waits them to finish.
>      */
>      mutex_lock(&ctx->uring_lock);
> 

Oh crap. I'll check if there are more suspects and patch it up, thanks

> I couldn't find any way in which io_req_task_submit() actually still
> relies on this. I think io_fallback_req_func() now relies on it,
> though I'm not sure whether that's intentional. ctx->fallback_work is
> flushed in io_ring_ctx_wait_and_kill(), but I think it can probably be
> restarted later on via:

Yes, io_fallback_req_func() relies on it, and it can be spun up
asynchronously from different places, e.g. in-IRQ block request
completion.

> io_ring_exit_work -> io_move_task_work_from_local ->
> io_req_normal_work_add -> io_fallback_tw(sync=false) ->
> schedule_delayed_work
> 
> I think it is probably guaranteed that ctx->refs is non-zero when we
> enter io_fallback_req_func, since I think we can't enter
> io_fallback_req_func with an empty ctx->fallback_llist, and the
> requests queued up on ctx->fallback_llist have to hold refcounted
> references to the ctx. But by the time we reach the mutex_unlock(), I
> think we're not guaranteed to hold any references on the ctx anymore,
> and so the ctx could theoretically be freed in the middle of the
> mutex_unlock() call?

Right, it comes with refs but loses them in between lock()/unlock().

> I think that to make this code properly correct, it might be necessary
> to either add another flush_delayed_work() call after ctx->refs has
> dropped to zero and we know that the fallback work can't be restarted
> anymore, or create an extra ctx->refs reference that is dropped in
> io_fallback_req_func() after the mutex_unlock(). (Though I guess it's
> probably unlikely that this goes wrong in practice.)
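
For illustration only, the second option (an extra reference that is
dropped after the unlock) could be modelled in userspace roughly as
below. This is a sketch under stated assumptions, not io_uring code:
struct demo_ctx and the demo_* helpers are hypothetical names, refs
stands in for ctx->refs, and a pthread mutex stands in for uring_lock.
As noted above, POSIX mutexes have a different contract than kernel
mutexes, so the sketch only shows the ordering of get/lock/unlock/put.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted context; NOT struct io_ring_ctx. */
struct demo_ctx {
	atomic_int refs;	/* models ctx->refs */
	pthread_mutex_t lock;	/* models ctx->uring_lock */
	bool *freed_flag;	/* demo only: set once the object is freed */
};

/* Allocate a context holding one reference (the queued request's). */
static struct demo_ctx *demo_ctx_alloc(bool *freed_flag)
{
	struct demo_ctx *ctx = malloc(sizeof(*ctx));

	if (!ctx)
		return NULL;
	atomic_init(&ctx->refs, 1);
	pthread_mutex_init(&ctx->lock, NULL);
	ctx->freed_flag = freed_flag;
	*freed_flag = false;
	return ctx;
}

static void demo_ctx_get(struct demo_ctx *ctx)
{
	atomic_fetch_add(&ctx->refs, 1);
}

/* Drop one reference; the last put frees the object, lock included. */
static void demo_ctx_put(struct demo_ctx *ctx)
{
	if (atomic_fetch_sub(&ctx->refs, 1) == 1) {
		bool *flag = ctx->freed_flag;

		pthread_mutex_destroy(&ctx->lock);
		free(ctx);
		*flag = true;
	}
}

/*
 * Models the fallback work. Returns whether the context had already
 * been freed at the point where the lock is about to be released.
 */
static bool demo_fallback_work(struct demo_ctx *ctx)
{
	bool *flag = ctx->freed_flag;
	bool freed_while_locked;

	demo_ctx_get(ctx);			/* extra ref pins ctx across unlock */
	pthread_mutex_lock(&ctx->lock);
	demo_ctx_put(ctx);			/* request's ref dropped under the lock */
	freed_while_locked = *flag;		/* still false: the extra ref is held */
	pthread_mutex_unlock(&ctx->lock);	/* lock memory is still valid here */
	demo_ctx_put(ctx);			/* only now may ctx be freed */
	return freed_while_locked;
}
```

Without the demo_ctx_get() at the top, the put inside the locked
section would free the object while its lock is still held, which is
the situation described above.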

-- 
Pavel Begunkov
