From: Fabiano Rosas <farosas@suse.de>
To: Thomas Huth <thuth@redhat.com>, qemu-devel@nongnu.org
Cc: Peter Xu <peterx@redhat.com>, Prasad Pandit <pjp@fedoraproject.org>
Subject: Re: [PULL 05/10] tests/qtest/migration: Force exit-on-error=false
Date: Thu, 26 Mar 2026 10:28:02 -0300
Message-ID: <87cy0qk9el.fsf@suse.de>
In-Reply-To: <7002cee0-9287-4ff1-9580-eff97aa02566@redhat.com>

Thomas Huth <thuth@redhat.com> writes:

> On 17/03/2026 19.23, Fabiano Rosas wrote:
>> Some tests can cause QEMU to exit(1) too early while the incoming
>> coroutine has not yielded for a first time yet. This trips ASAN
>> because resources related to dispatching the incoming process will
>> still be allocated in the io/channel.c layer without a
>> straight-forward way for the migration code to clean them up.
>>
>> As an example of one such issue, the UUID validation happens early
>> enough that the temporary socket from qio_net_listener_channel_func()
>> still has an elevated refcount. If it fails, the listener dispatch
>> code never gets to free the resource:
>>
>> Direct leak of 400 byte(s) in 1 object(s) allocated from:
>> #0 0x55e668890a07 in malloc asan_malloc_linux.cpp:68:3
>> #1 0x7f3c7e2b6648 in g_malloc ../glib/gmem.c:130
>> #2 0x55e66a8ef05f in object_new_with_type ../qom/object.c:767:15
>> #3 0x55e66a8ef178 in object_new ../qom/object.c:789:12
>> #4 0x55e66a93bcc6 in qio_channel_socket_new ../io/channel-socket.c:70:31
>> #5 0x55e66a93f34f in qio_channel_socket_accept ../io/channel-socket.c:401:12
>> #6 0x55e66a96752a in qio_net_listener_channel_func ../io/net-listener.c:64:12
>> #7 0x55e66a94bdac in qio_channel_fd_source_dispatch ../io/channel-watch.c:84:12
>> #8 0x7f3c7e2adf4b in g_main_dispatch ../glib/gmain.c:3476
>> #9 0x7f3c7e2adf4b in g_main_context_dispatch_unlocked ../glib/gmain.c:4284
>> #10 0x7f3c7e2b00c8 in g_main_context_dispatch ../glib/gmain.c:4272
>>
>> The exit(1) also requires some tests to set up qtest to expect a return
>> code of 1 from the QEMU process. Although we can check migration
>> status changes to be fairly certain where the failure happened, there
>> is always the possibility of QEMU exiting for another reason and the
>> test passing. This happens frequently with sanitizers enabled, but
>> also risks masking issues in the regular build.
>>
>> Stop allowing the incoming migration to exit and instead require the
>> tests to wait for the FAILED state and end QEMU gracefully with
>> qtest_quit.
>>
>> In practice this means setting exit-on-error=false for every incoming
>> migration, changing MIG_TEST_FAIL_DEST_QUIT_ERR to MIG_TEST_FAIL and
>> waiting for a change of state where necessary.
>>
>> With this, the MIG_TEST_FAIL_DEST_QUIT_ERR error result is now unused,
>> remove it.
>>
>> The affected tests are:
>> validate_uuid_error
>> multifd_tcp_cancel
>> dirty_limit
>> precopy_unix_tls_x509_default_host
>> precopy_tcp_tls_no_hostname
>> tcp_tls_x509_mismatch_host
>> dbus_vmstate_missing_src
>> dbus_vmstate_missing_dst
>>
>> Also add a comment to QEMU source explaining that the incoming
>> coroutine might block for a while until it yields as this is the
>> actual root cause of the issue.
>>
>> Reviewed-by: Peter Xu <peterx@redhat.com>
>> Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
>> Link: https://lore.kernel.org/qemu-devel/20260311213418.16951-6-farosas@suse.de
>> [assert that key doesn't already exists]
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> ---
>> migration/migration.c | 5 +++++
>> tests/qtest/dbus-vmstate-test.c | 5 +++--
>> tests/qtest/migration/framework.c | 5 +----
>> tests/qtest/migration/framework.h | 2 --
>> tests/qtest/migration/migration-qmp.c | 7 +++++++
>> tests/qtest/migration/misc-tests.c | 4 ++--
>> tests/qtest/migration/precopy-tests.c | 12 +++++-------
>> tests/qtest/migration/tls-tests.c | 14 ++++++++------
>> 8 files changed, 31 insertions(+), 23 deletions(-)
>
> Hi Fabiano,
>
> this patch now triggers a failure in the qtests when I'm running these in
> "SPEED=thorough" mode:
>
> MESON_TEST_ITERATION=1 MALLOC_PERTURB_=120
> ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 G_TEST_SLOW=1
> PYTHON=/home/thuth/tmp/qemu-build/pyvenv/bin/python3 RUST_BACKTRACE=1
> QTEST_QEMU_IMG=./qemu-img
> G_TEST_DBUS_DAEMON=/home/thuth/devel/qemu/tests/dbus-vmstate-daemon.sh
> QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon
> UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1
> MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1
> QTEST_QEMU_BINARY=./qemu-system-x86_64
> /home/thuth/tmp/qemu-build/tests/qtest/migration-test --tap -k --full
>
> TAP version 14
> # random seed: R02Sb882c8142734dce2265e65214fd2b060
> # starting QEMU: exec ./qemu-system-x86_64 -qtest
> unix:/tmp/qtest-106610.sock -qtest-log /dev/null -chardev
> socket,path=/tmp/qtest-106610.qmp,id=char0 -mon chardev=char0,mode=control
> -display none -audio none -run-with exit-with-parent=on -machine none -accel
> qtest
> # Skipping test: userfaultfd not available
> 1..80
> # Start of x86_64 tests
> # Running /x86_64/dirty_limit
> # Using machine type: pc-q35-11.0
> # starting QEMU: exec ./qemu-system-x86_64 -qtest
> unix:/tmp/qtest-106610.sock -qtest-log /dev/null -chardev
> socket,path=/tmp/qtest-106610.qmp,id=char0 -mon chardev=char0,mode=control
> -display none -audio none -run-with exit-with-parent=on -accel
> kvm,dirty-ring-size=4096 -accel tcg -machine pc-q35-11.0, -name
> source,debug-threads=on -machine memory-backend=mig.mem -object
> memory-backend-ram,id=mig.mem,size=150M,share=off -serial
> file:/tmp/migration-test-8B95M3/src_serial -drive
> if=none,id=d0,file=/tmp/migration-test-8B95M3/bootsect,format=raw -device
> ide-hd,drive=d0,secs=1,cyls=1,heads=1 2>/dev/null -accel qtest
> # starting QEMU: exec ./qemu-system-x86_64 -qtest
> unix:/tmp/qtest-106610.sock -qtest-log /dev/null -chardev
> socket,path=/tmp/qtest-106610.qmp,id=char0 -mon chardev=char0,mode=control
> -display none -audio none -run-with exit-with-parent=on -accel
> kvm,dirty-ring-size=4096 -accel tcg -machine pc-q35-11.0, -name
> target,debug-threads=on -machine memory-backend=mig.mem -object
> memory-backend-ram,id=mig.mem,size=150M,share=off -serial
> file:/tmp/migration-test-8B95M3/dest_serial -incoming
> unix:/tmp/migration-test-8B95M3/migsocket -drive
> if=none,id=d0,file=/tmp/migration-test-8B95M3/bootsect,format=raw -device
> ide-hd,drive=d0,secs=1,cyls=1,heads=1 2>/dev/null -accel qtest
> ../../devel/qemu/tests/qtest/libqtest.c:201: kill_qemu() tried to terminate
> QEMU process but encountered exit status 1 (expected 0)
> Aborted (core dumped)
>
> Could you please try whether you could reproduce that crash?
>
> Thomas

Argh, too many dirty this, dirty that. I'll send a patch. Thanks!