qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Brian <hibriansong@gmail.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org,
	Kevin Wolf <kwolf@redhat.com>, Hanna Czenczek <hreitz@redhat.com>
Subject: Re: [RFC 08/11] aio-posix: gracefully handle io_uring_queue_init() failure
Date: Mon, 2 Jun 2025 18:37:08 -0400	[thread overview]
Message-ID: <736d9b1a-af8f-404a-9fda-c44e8a4c6df3@gmail.com> (raw)
In-Reply-To: <20250602202051.GE300284@fedora>

[-- Attachment #1: Type: text/plain, Size: 6179 bytes --]


On 6/2/25 4:20 PM, Stefan Hajnoczi wrote:
> On Mon, Jun 02, 2025 at 08:26:39AM -0400, Brian wrote:
>> On 5/28/25 3:09 PM, Stefan Hajnoczi wrote:
>>> io_uring may not be available at runtime due to system policies (e.g.
>>> the io_uring_disabled sysctl) or creation could fail due to file
>>> descriptor resource limits.
>>>
>>> Handle failure scenarios as follows:
>>>
>>> If another AioContext already has io_uring, then fail AioContext
>>> creation so that the aio_add_sqe() API is available uniformly from all
>>> QEMU threads. Otherwise fall back to epoll(7) if io_uring is
>>> unavailable.
>>>
>>> Notes:
>>> - Update the comment about selecting the fastest fdmon implementation.
>>>     At this point it's not about speed anymore, it's about aio_add_sqe()
>>>     API availability.
>>> - Uppercase the error message when converting from error_report() to
>>>     error_setg_errno() for consistency (but there are instances of
>>>     lowercase in the codebase).
>>> - It's easier to move the #ifdefs from aio-posix.h to aio-posix.c.
>>>
>>> Signed-off-by: Stefan Hajnoczi<stefanha@redhat.com>
>>> ---
>>>    util/aio-posix.h      | 12 ++----------
>>>    util/aio-posix.c      | 29 ++++++++++++++++++++++++++---
>>>    util/fdmon-io_uring.c |  8 ++++----
>>>    3 files changed, 32 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/util/aio-posix.h b/util/aio-posix.h
>>> index f9994ed79e..6f9d97d866 100644
>>> --- a/util/aio-posix.h
>>> +++ b/util/aio-posix.h
>>> @@ -18,6 +18,7 @@
>>>    #define AIO_POSIX_H
>>>    #include "block/aio.h"
>>> +#include "qapi/error.h" struct AioHandler { GPollFD pfd; @@ -72,17
>>> +73,8 @@ static inline void fdmon_epoll_disable(AioContext *ctx) #endif
>>> /* !CONFIG_EPOLL_CREATE1 */ #ifdef CONFIG_LINUX_IO_URING -bool
>>> fdmon_io_uring_setup(AioContext *ctx); +void
>>> fdmon_io_uring_setup(AioContext *ctx, Error **errp); void
>>> fdmon_io_uring_destroy(AioContext *ctx); -#else -static inline bool
>>> fdmon_io_uring_setup(AioContext *ctx) -{ - return false; -} - -static
>>> inline void fdmon_io_uring_destroy(AioContext *ctx) -{ -} #endif /*
>>> !CONFIG_LINUX_IO_URING */ #endif /* AIO_POSIX_H */ diff --git
>>> a/util/aio-posix.c b/util/aio-posix.c index fa047fc7ad..44b3df61f9
>>> 100644 --- a/util/aio-posix.c +++ b/util/aio-posix.c @@ -16,6 +16,7 @@
>>> #include "qemu/osdep.h"
>>>    #include "block/block.h"
>>>    #include "block/thread-pool.h"
>>> +#include "qapi/error.h"
>>>    #include "qemu/main-loop.h"
>>>    #include "qemu/lockcnt.h"
>>>    #include "qemu/rcu.h"
>>> @@ -717,17 +718,39 @@ void aio_context_setup(AioContext *ctx, Error **errp)
>>>        ctx->epollfd = -1;
>>>        ctx->epollfd_tag = NULL;
>>> -    /* Use the fastest fd monitoring implementation if available */
>>> -    if (fdmon_io_uring_setup(ctx)) {
>>> -        return;
>>> +#ifdef CONFIG_LINUX_IO_URING
>>> +    {
>>> +        static bool need_io_uring;
>>> +        Error *local_err = NULL; /* ERRP_GUARD() doesn't handle error_abort */
>>> +
>>> +        /* io_uring takes precedence because it provides aio_add_sqe() support */
>>> +        fdmon_io_uring_setup(ctx, &local_err);
>>> +        if (!local_err) {
>>> +            /*
>>> +             * If one AioContext gets io_uring, then all AioContexts need io_uring
>>> +             * so that aio_add_sqe() support is available across all threads.
>>> +             */
>>> +            need_io_uring = true;
>>> +            return;
>>> +        }
>>> +        if (need_io_uring) {
>>> +            error_propagate(errp, local_err);
>>> +            return;
>>> +        }
>>> +
>>> +        warn_report_err_once(local_err); /* frees local_err */
>>> +        local_err = NULL;
>>>        }
>>> +#endif /* CONFIG_LINUX_IO_URING */
>> Is there a problem with the logic of this code snippet?
>>
>> If we fail at fdmon_io_uring_setup, specifically at io_uring_queue_init,
>> local_err (or errp) will be set to a non-NULL error value. In that case,
>> need_io_uring will be set to true, but the function will return immediately.
> If local_err is non-NULL then this conditional is not taken:
>
>    if (!local_err) {
>        /*
>         * If one AioContext gets io_uring, then all AioContexts need io_uring
>         * so that aio_add_sqe() support is available across all threads.
>         */
>        need_io_uring = true;
>        return;
>    }
>
> If the logic you described is correct, please rephrase it. I don't see
> how what you've written can happen.

Sorry, I didn’t notice that need_io_uring is a static var. Was confused 
about when if (need_io_block) gets executed

>> As a result, the later if (need_io_uring) block will never be executed
>>
>>>        fdmon_epoll_setup(ctx);
>>>    }
>>>    void aio_context_destroy(AioContext *ctx)
>>>    {
>>> +#ifdef CONFIG_LINUX_IO_URING
>>>        fdmon_io_uring_destroy(ctx);
>>> +#endif
>>>        qemu_lockcnt_lock(&ctx->list_lock);
>>>        fdmon_epoll_disable(ctx);
>>> diff --git a/util/fdmon-io_uring.c b/util/fdmon-io_uring.c
>>> index 2092d08d24..ef1a866a03 100644
>>> --- a/util/fdmon-io_uring.c
>>> +++ b/util/fdmon-io_uring.c
>>> @@ -45,6 +45,7 @@
>>>    #include "qemu/osdep.h"
>>>    #include <poll.h>
>>> +#include "qapi/error.h"
>>>    #include "qemu/rcu_queue.h"
>>>    #include "aio-posix.h"
>>> @@ -361,7 +362,7 @@ static const FDMonOps fdmon_io_uring_ops = {
>>>        .gsource_dispatch = fdmon_io_uring_gsource_dispatch,
>>>    };
>>> -bool fdmon_io_uring_setup(AioContext *ctx)
>>> +void fdmon_io_uring_setup(AioContext *ctx, Error **errp)
>>>    {
>>>        int ret;
>>> @@ -369,15 +370,14 @@ bool fdmon_io_uring_setup(AioContext *ctx)
>>>        ret = io_uring_queue_init(FDMON_IO_URING_ENTRIES, &ctx->fdmon_io_uring, 0);
>>>        if (ret != 0) {
>>> -        return false;
>>> +        error_setg_errno(errp, -ret, "Failed to initialize io_uring");
>>> +        return;
>>>        }
>>>        QSLIST_INIT(&ctx->submit_list);
>>>        ctx->fdmon_ops = &fdmon_io_uring_ops;
>>>        ctx->io_uring_fd_tag = g_source_add_unix_fd(&ctx->source,
>>>                ctx->fdmon_io_uring.ring_fd, G_IO_IN);
>>> -
>>> -    return true;
>>>    }
>>>    void fdmon_io_uring_destroy(AioContext *ctx)

[-- Attachment #2: Type: text/html, Size: 6862 bytes --]

  reply	other threads:[~2025-06-02 22:38 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-28 19:09 [RFC 00/11] aio: add the aio_add_sqe() io_uring API Stefan Hajnoczi
2025-05-28 19:09 ` [RFC 01/11] aio-posix: fix polling mode with fdmon-io_uring Stefan Hajnoczi
2025-05-28 20:29   ` Eric Blake
2025-05-28 19:09 ` [RFC 02/11] aio-posix: keep polling enabled with fdmon-io_uring.c Stefan Hajnoczi
2025-05-28 20:34   ` Eric Blake
2025-05-28 19:09 ` [RFC 03/11] tests/unit: skip test-nested-aio-poll with io_uring Stefan Hajnoczi
2025-05-28 20:40   ` Eric Blake
2025-05-28 19:09 ` [RFC 04/11] aio-posix: integrate fdmon into glib event loop Stefan Hajnoczi
2025-05-28 21:01   ` Eric Blake
2025-05-28 19:09 ` [RFC 05/11] aio: remove aio_context_use_g_source() Stefan Hajnoczi
2025-05-28 21:02   ` Eric Blake
2025-05-28 19:09 ` [RFC 06/11] aio: free AioContext when aio_context_new() fails Stefan Hajnoczi
2025-05-28 21:06   ` Eric Blake
2025-06-05 17:49     ` Stefan Hajnoczi
2025-05-28 19:09 ` [RFC 07/11] aio: add errp argument to aio_context_setup() Stefan Hajnoczi
2025-05-28 21:07   ` Eric Blake
2025-05-28 19:09 ` [RFC 08/11] aio-posix: gracefully handle io_uring_queue_init() failure Stefan Hajnoczi
2025-05-28 22:12   ` Eric Blake
2025-05-29 15:38     ` Stefan Hajnoczi
2025-06-03  6:05       ` Markus Armbruster
2025-06-03 18:48         ` Stefan Hajnoczi
2025-06-02 12:26   ` Brian
2025-06-02 20:20     ` Stefan Hajnoczi
2025-06-02 22:37       ` Brian [this message]
2025-05-28 19:09 ` [RFC 09/11] aio-posix: add aio_add_sqe() API for user-defined io_uring requests Stefan Hajnoczi
2025-05-28 22:15   ` Eric Blake
2025-05-29 20:02   ` Eric Blake
2025-06-05 17:52     ` Stefan Hajnoczi
2025-05-28 19:09 ` [RFC 10/11] aio-posix: avoid EventNotifier for cqe_handler_bh Stefan Hajnoczi
2025-05-29 20:09   ` Eric Blake
2025-05-28 19:09 ` [RFC 11/11] block/io_uring: use aio_add_sqe() Stefan Hajnoczi
2025-05-29 21:11   ` Eric Blake
2025-06-05 18:40     ` Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=736d9b1a-af8f-404a-9fda-c44e8a4c6df3@gmail.com \
    --to=hibriansong@gmail.com \
    --cc=hreitz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).