All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@redhat.com>
To: Brian <hibriansong@gmail.com>
Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org,
	Kevin Wolf <kwolf@redhat.com>, Hanna Czenczek <hreitz@redhat.com>
Subject: Re: [RFC 08/11] aio-posix: gracefully handle io_uring_queue_init() failure
Date: Mon, 2 Jun 2025 16:20:51 -0400	[thread overview]
Message-ID: <20250602202051.GE300284@fedora> (raw)
In-Reply-To: <a0a98436-5e4e-46d1-9a66-b6edce5c0ecc@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 6088 bytes --]

On Mon, Jun 02, 2025 at 08:26:39AM -0400, Brian wrote:
> On 5/28/25 3:09 PM, Stefan Hajnoczi wrote:
> > io_uring may not be available at runtime due to system policies (e.g.
> > the io_uring_disabled sysctl) or creation could fail due to file
> > descriptor resource limits.
> > 
> > Handle failure scenarios as follows:
> > 
> > If another AioContext already has io_uring, then fail AioContext
> > creation so that the aio_add_sqe() API is available uniformly from all
> > QEMU threads. Otherwise fall back to epoll(7) if io_uring is
> > unavailable.
> > 
> > Notes:
> > - Update the comment about selecting the fastest fdmon implementation.
> >    At this point it's not about speed anymore, it's about aio_add_sqe()
> >    API availability.
> > - Uppercase the error message when converting from error_report() to
> >    error_setg_errno() for consistency (but there are instances of
> >    lowercase in the codebase).
> > - It's easier to move the #ifdefs from aio-posix.h to aio-posix.c.
> > 
> > Signed-off-by: Stefan Hajnoczi<stefanha@redhat.com>
> > ---
> >   util/aio-posix.h      | 12 ++----------
> >   util/aio-posix.c      | 29 ++++++++++++++++++++++++++---
> >   util/fdmon-io_uring.c |  8 ++++----
> >   3 files changed, 32 insertions(+), 17 deletions(-)
> > 
> > diff --git a/util/aio-posix.h b/util/aio-posix.h
> > index f9994ed79e..6f9d97d866 100644
> > --- a/util/aio-posix.h
> > +++ b/util/aio-posix.h
> > @@ -18,6 +18,7 @@
> >   #define AIO_POSIX_H
> >   #include "block/aio.h"
> > +#include "qapi/error.h" struct AioHandler { GPollFD pfd; @@ -72,17
> > +73,8 @@ static inline void fdmon_epoll_disable(AioContext *ctx) #endif
> > /* !CONFIG_EPOLL_CREATE1 */ #ifdef CONFIG_LINUX_IO_URING -bool
> > fdmon_io_uring_setup(AioContext *ctx); +void
> > fdmon_io_uring_setup(AioContext *ctx, Error **errp); void
> > fdmon_io_uring_destroy(AioContext *ctx); -#else -static inline bool
> > fdmon_io_uring_setup(AioContext *ctx) -{ - return false; -} - -static
> > inline void fdmon_io_uring_destroy(AioContext *ctx) -{ -} #endif /*
> > !CONFIG_LINUX_IO_URING */ #endif /* AIO_POSIX_H */ diff --git
> > a/util/aio-posix.c b/util/aio-posix.c index fa047fc7ad..44b3df61f9
> > 100644 --- a/util/aio-posix.c +++ b/util/aio-posix.c @@ -16,6 +16,7 @@
> > #include "qemu/osdep.h"
> >   #include "block/block.h"
> >   #include "block/thread-pool.h"
> > +#include "qapi/error.h"
> >   #include "qemu/main-loop.h"
> >   #include "qemu/lockcnt.h"
> >   #include "qemu/rcu.h"
> > @@ -717,17 +718,39 @@ void aio_context_setup(AioContext *ctx, Error **errp)
> >       ctx->epollfd = -1;
> >       ctx->epollfd_tag = NULL;
> > -    /* Use the fastest fd monitoring implementation if available */
> > -    if (fdmon_io_uring_setup(ctx)) {
> > -        return;
> > +#ifdef CONFIG_LINUX_IO_URING
> > +    {
> > +        static bool need_io_uring;
> > +        Error *local_err = NULL; /* ERRP_GUARD() doesn't handle error_abort */
> > +
> > +        /* io_uring takes precedence because it provides aio_add_sqe() support */
> > +        fdmon_io_uring_setup(ctx, &local_err);
> > +        if (!local_err) {
> > +            /*
> > +             * If one AioContext gets io_uring, then all AioContexts need io_uring
> > +             * so that aio_add_sqe() support is available across all threads.
> > +             */
> > +            need_io_uring = true;
> > +            return;
> > +        }
> > +        if (need_io_uring) {
> > +            error_propagate(errp, local_err);
> > +            return;
> > +        }
> > +
> > +        warn_report_err_once(local_err); /* frees local_err */
> > +        local_err = NULL;
> >       }
> > +#endif /* CONFIG_LINUX_IO_URING */
> Is there a problem with the logic of this code snippet?
> 
> If we fail at fdmon_io_uring_setup, specifically at io_uring_queue_init,
> local_err (or errp) will be set to a non-NULL error value. In that case,
> need_io_uring will be set to true, but the function will return immediately.

If local_err is non-NULL then this conditional is not taken:

  if (!local_err) {
      /*
       * If one AioContext gets io_uring, then all AioContexts need io_uring
       * so that aio_add_sqe() support is available across all threads.
       */
      need_io_uring = true;
      return;
  }

If the logic you described is correct, please rephrase it. I don't see
how what you've written can happen.

> As a result, the later if (need_io_uring) block will never be executed
> 
> >       fdmon_epoll_setup(ctx);
> >   }
> >   void aio_context_destroy(AioContext *ctx)
> >   {
> > +#ifdef CONFIG_LINUX_IO_URING
> >       fdmon_io_uring_destroy(ctx);
> > +#endif
> >       qemu_lockcnt_lock(&ctx->list_lock);
> >       fdmon_epoll_disable(ctx);
> > diff --git a/util/fdmon-io_uring.c b/util/fdmon-io_uring.c
> > index 2092d08d24..ef1a866a03 100644
> > --- a/util/fdmon-io_uring.c
> > +++ b/util/fdmon-io_uring.c
> > @@ -45,6 +45,7 @@
> >   #include "qemu/osdep.h"
> >   #include <poll.h>
> > +#include "qapi/error.h"
> >   #include "qemu/rcu_queue.h"
> >   #include "aio-posix.h"
> > @@ -361,7 +362,7 @@ static const FDMonOps fdmon_io_uring_ops = {
> >       .gsource_dispatch = fdmon_io_uring_gsource_dispatch,
> >   };
> > -bool fdmon_io_uring_setup(AioContext *ctx)
> > +void fdmon_io_uring_setup(AioContext *ctx, Error **errp)
> >   {
> >       int ret;
> > @@ -369,15 +370,14 @@ bool fdmon_io_uring_setup(AioContext *ctx)
> >       ret = io_uring_queue_init(FDMON_IO_URING_ENTRIES, &ctx->fdmon_io_uring, 0);
> >       if (ret != 0) {
> > -        return false;
> > +        error_setg_errno(errp, -ret, "Failed to initialize io_uring");
> > +        return;
> >       }
> >       QSLIST_INIT(&ctx->submit_list);
> >       ctx->fdmon_ops = &fdmon_io_uring_ops;
> >       ctx->io_uring_fd_tag = g_source_add_unix_fd(&ctx->source,
> >               ctx->fdmon_io_uring.ring_fd, G_IO_IN);
> > -
> > -    return true;
> >   }
> >   void fdmon_io_uring_destroy(AioContext *ctx)

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2025-06-02 22:27 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-28 19:09 [RFC 00/11] aio: add the aio_add_sqe() io_uring API Stefan Hajnoczi
2025-05-28 19:09 ` [RFC 01/11] aio-posix: fix polling mode with fdmon-io_uring Stefan Hajnoczi
2025-05-28 20:29   ` Eric Blake
2025-05-28 19:09 ` [RFC 02/11] aio-posix: keep polling enabled with fdmon-io_uring.c Stefan Hajnoczi
2025-05-28 20:34   ` Eric Blake
2025-05-28 19:09 ` [RFC 03/11] tests/unit: skip test-nested-aio-poll with io_uring Stefan Hajnoczi
2025-05-28 20:40   ` Eric Blake
2025-05-28 19:09 ` [RFC 04/11] aio-posix: integrate fdmon into glib event loop Stefan Hajnoczi
2025-05-28 21:01   ` Eric Blake
2025-05-28 19:09 ` [RFC 05/11] aio: remove aio_context_use_g_source() Stefan Hajnoczi
2025-05-28 21:02   ` Eric Blake
2025-05-28 19:09 ` [RFC 06/11] aio: free AioContext when aio_context_new() fails Stefan Hajnoczi
2025-05-28 21:06   ` Eric Blake
2025-06-05 17:49     ` Stefan Hajnoczi
2025-05-28 19:09 ` [RFC 07/11] aio: add errp argument to aio_context_setup() Stefan Hajnoczi
2025-05-28 21:07   ` Eric Blake
2025-05-28 19:09 ` [RFC 08/11] aio-posix: gracefully handle io_uring_queue_init() failure Stefan Hajnoczi
2025-05-28 22:12   ` Eric Blake
2025-05-29 15:38     ` Stefan Hajnoczi
2025-06-03  6:05       ` Markus Armbruster
2025-06-03 18:48         ` Stefan Hajnoczi
2025-06-02 12:26   ` Brian
2025-06-02 20:20     ` Stefan Hajnoczi [this message]
2025-06-02 22:37       ` Brian
2025-05-28 19:09 ` [RFC 09/11] aio-posix: add aio_add_sqe() API for user-defined io_uring requests Stefan Hajnoczi
2025-05-28 22:15   ` Eric Blake
2025-05-29 20:02   ` Eric Blake
2025-06-05 17:52     ` Stefan Hajnoczi
2025-05-28 19:09 ` [RFC 10/11] aio-posix: avoid EventNotifier for cqe_handler_bh Stefan Hajnoczi
2025-05-29 20:09   ` Eric Blake
2025-05-28 19:09 ` [RFC 11/11] block/io_uring: use aio_add_sqe() Stefan Hajnoczi
2025-05-29 21:11   ` Eric Blake
2025-06-05 18:40     ` Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250602202051.GE300284@fedora \
    --to=stefanha@redhat.com \
    --cc=hibriansong@gmail.com \
    --cc=hreitz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.