qemu-devel.nongnu.org archive mirror
From: Stefan Hajnoczi <stefanha@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-devel@nongnu.org, Aarushi Mehta <mehta.aaru20@gmail.com>,
	Fam Zheng <fam@euphon.net>,
	Stefano Garzarella <sgarzare@redhat.com>,
	Hanna Czenczek <hreitz@redhat.com>,
	eblake@redhat.com, qemu-block@nongnu.org, hibriansong@gmail.com,
	Stefan Weil <sw@weilnetz.de>, Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [PATCH v4 05/12] aio: remove aio_context_use_g_source()
Date: Thu, 23 Oct 2025 15:43:06 -0400
Message-ID: <20251023194306.GA62080@fedora>
In-Reply-To: <aPidpOuudtn6hlAm@redhat.com>

On Wed, Oct 22, 2025 at 11:02:28AM +0200, Kevin Wolf wrote:
> Am 21.10.2025 um 21:10 hat Stefan Hajnoczi geschrieben:
> > On Thu, Oct 09, 2025 at 06:59:20PM +0200, Kevin Wolf wrote:
> > > Am 09.10.2025 um 17:46 hat Kevin Wolf geschrieben:
> > > > Am 10.09.2025 um 19:56 hat Stefan Hajnoczi geschrieben:
> > > > > There is no need for aio_context_use_g_source() now that epoll(7) and
> > > > > io_uring(7) file descriptor monitoring works with the glib event loop.
> > > > > AioContext doesn't need to be notified that GSource is being used.
> > > > > 
> > > > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > > > Reviewed-by: Eric Blake <eblake@redhat.com>
> > > > 
> > > > We should probably mention in the commit message that this causes the
> > > > default fdmon on Linux to change from poll to io_uring. It's a small
> > > > code change, but it makes QEMU use a completely different code path by
> > > > default.
> > > 
> > > Just to make sure, I ran 'make check' after this patch and it's failing
> > > for me:
> > > 
> > >  10/401 qemu:qtest+qtest-x86_64 / qtest-x86_64/ahci-test                    TIMEOUT        150.02s   killed by signal 15 SIGTERM
> > > 133/401 qemu:unit / test-aio                                                TIMEOUT         30.01s   killed by signal 15 SIGTERM
> > > 137/401 qemu:unit / test-bdrv-drain                                         TIMEOUT         30.01s   killed by signal 15 SIGTERM
> > > 142/401 qemu:unit / test-block-iothread                                     TIMEOUT         30.01s   killed by signal 15 SIGTERM
> > > 192/401 qemu:doc+rust / rust-bql-rs-doctests                                FAIL             0.84s   exit status 101
> > > 311/401 qemu:block / io-qcow2-267                                           ERROR            3.20s   exit status 1
> > > 321/401 qemu:block / io-qcow2-copy-before-write                             TIMEOUT        180.01s   killed by signal 15 SIGTERM
> > > 
> > > Some of them look unrelated, but I have confirmed that the three unit
> > > tests still pass before this patch (and still hang after the complete
> > > series).
> > 
> > I can't reproduce these failures, regardless of whether sysctl
> > kernel.io_uring_disabled is 0 or 1.
> > 
> > Can you launch the unit tests from your terminal and post the output?
> > 
> >   $ cd qemu
> >   $ build/tests/unit/test-aio
> 
> TAP version 14
> # random seed: R02S48dcdde28634143f18bad3947c52d334
> 1..27
> # Start of aio tests
> # Start of bh tests
> ok 1 /aio/bh/schedule
> ok 2 /aio/bh/schedule10
> ok 3 /aio/bh/cancel
> ok 4 /aio/bh/delete
> ok 5 /aio/bh/flush
> # Start of callback-delete tests
> ok 6 /aio/bh/callback-delete/one
> ok 7 /aio/bh/callback-delete/many
> # End of callback-delete tests
> # End of bh tests
> # Start of event tests
> ok 8 /aio/event/add-remove
> ok 9 /aio/event/wait
> ok 10 /aio/event/flush
> # Start of wait tests
> ok 11 /aio/event/wait/no-flush-cb
> # End of wait tests
> # End of event tests
> # Start of timer tests
> 
> >   $ build/tests/unit/test-bdrv-drain
> 
> TAP version 14
> # random seed: R02S7d6ba0fc81d5b90d323813d680a30644
> 1..30
> # Start of bdrv-drain tests
> ok 1 /bdrv-drain/nested
> ok 2 /bdrv-drain/set_aio_context
> # Start of driver-cb tests
> 
> 
> >   $ build/tests/unit/test-block-iothread
> 
> TAP version 14
> # random seed: R02Sf81baf68887daa9b86be5c72b99df589
> 1..22
> # Start of sync-op tests
> ok 1 /sync-op/pread
> ok 2 /sync-op/pwrite
> ok 3 /sync-op/preadv
> ok 4 /sync-op/pwritev
> ok 5 /sync-op/preadv_part
> ok 6 /sync-op/pwritev_part
> ok 7 /sync-op/pwrite_compressed
> ok 8 /sync-op/pwrite_zeroes
> ok 9 /sync-op/load_vmstate
> ok 10 /sync-op/save_vmstate
> ok 11 /sync-op/pdiscard
> ok 12 /sync-op/truncate
> ok 13 /sync-op/block_status
> ok 14 /sync-op/flush
> ok 15 /sync-op/check
> ok 16 /sync-op/activate
> # End of sync-op tests
> # Start of attach tests
> 
> > That will show exactly which sub-test case is hanging.
> > Other information that might help: your host kernel version and liburing
> > version.
> 
> This is a F42 system.
> 
> kernel-6.16.12-200.fc42.x86_64
> liburing-2.9-1.fc42.x86_64
> 
> If you can't reproduce or find a hypothesis what's happening, I can try
> to debug one of the hanging processes.

Unfortunately I haven't been able to reproduce it on my system. It's
a F42 machine with the same package versions as your machine.

The test-aio timer tests look like good candidates for debugging. The
test is likely either stuck in an infinite do {} while
(!aio_poll(ctx, false)) loop or blocked in an aio_poll(ctx, true) call
that never returns.
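
To make the two suspected hang patterns concrete, here is a minimal
standalone model. It is not QEMU code: FakeCtx, fake_aio_poll() and the
helper names are hypothetical stand-ins, and the only assumption taken
from the thread is that aio_poll() returns true when progress was made.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an AioContext with one pending timer. */
typedef struct {
    int polls_until_fire; /* < 0 models an event that will never arrive */
} FakeCtx;

/*
 * Hypothetical stand-in for aio_poll(ctx, blocking); returns true when
 * progress was made. A real blocking call sleeps until an event arrives,
 * so a lost event means it never returns. The model reports that case
 * through *would_hang instead of actually hanging.
 */
bool fake_aio_poll(FakeCtx *ctx, bool blocking, bool *would_hang)
{
    assert(ctx != NULL && would_hang != NULL);
    *would_hang = false;
    if (ctx->polls_until_fire < 0) {
        *would_hang = blocking;
        return false;
    }
    if (blocking || ctx->polls_until_fire == 0) {
        ctx->polls_until_fire = -1; /* one-shot timer: nothing left pending */
        return true;
    }
    ctx->polls_until_fire--;
    return false;
}

/*
 * Pattern 1: do {} while (!aio_poll(ctx, false)). Returns the number of
 * polls needed, or -1 once max_polls is reached. The real loop has no
 * bound, so the -1 case corresponds to spinning forever.
 */
int spin_until_progress(FakeCtx *ctx, int max_polls)
{
    bool unused;
    int polls = 0;

    do {
        if (polls == max_polls) {
            return -1;
        }
        polls++;
    } while (!fake_aio_poll(ctx, false, &unused));
    return polls;
}

/* Pattern 2: a single aio_poll(ctx, true) with no event ever arriving. */
bool blocking_poll_would_hang(FakeCtx *ctx)
{
    bool would_hang;

    fake_aio_poll(ctx, true, &would_hang);
    return would_hang;
}
```

In a backtrace of a hung test process the two cases should look
different: pattern 1 keeps re-entering the poll function and burns CPU,
while pattern 2 sits idle inside a single blocking wait.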

Thanks for your help with debugging!

Stefan
