From: Thomas Huth <thuth@redhat.com>
To: Fabiano Rosas <farosas@suse.de>, qemu-devel@nongnu.org
Cc: Peter Xu <peterx@redhat.com>, Prasad Pandit <pjp@fedoraproject.org>
Subject: Re: [PULL 05/10] tests/qtest/migration: Force exit-on-error=false
Date: Thu, 26 Mar 2026 10:02:53 +0100
Message-ID: <7002cee0-9287-4ff1-9580-eff97aa02566@redhat.com>
In-Reply-To: <20260317182320.31991-6-farosas@suse.de>

On 17/03/2026 19.23, Fabiano Rosas wrote:
> Some tests can cause QEMU to exit(1) too early while the incoming
> coroutine has not yielded for a first time yet. This trips ASAN
> because resources related to dispatching the incoming process will
> still be allocated in the io/channel.c layer without a
> straight-forward way for the migration code to clean them up.
> 
> As an example of one such issue, the UUID validation happens early
> enough that the temporary socket from qio_net_listener_channel_func()
> still has an elevated refcount. If it fails, the listener dispatch
> code never gets to free the resource:
> 
> Direct leak of 400 byte(s) in 1 object(s) allocated from:
>      #0 0x55e668890a07 in malloc asan_malloc_linux.cpp:68:3
>      #1 0x7f3c7e2b6648 in g_malloc ../glib/gmem.c:130
>      #2 0x55e66a8ef05f in object_new_with_type ../qom/object.c:767:15
>      #3 0x55e66a8ef178 in object_new ../qom/object.c:789:12
>      #4 0x55e66a93bcc6 in qio_channel_socket_new ../io/channel-socket.c:70:31
>      #5 0x55e66a93f34f in qio_channel_socket_accept ../io/channel-socket.c:401:12
>      #6 0x55e66a96752a in qio_net_listener_channel_func ../io/net-listener.c:64:12
>      #7 0x55e66a94bdac in qio_channel_fd_source_dispatch ../io/channel-watch.c:84:12
>      #8 0x7f3c7e2adf4b in g_main_dispatch ../glib/gmain.c:3476
>      #9 0x7f3c7e2adf4b in g_main_context_dispatch_unlocked ../glib/gmain.c:4284
>      #10 0x7f3c7e2b00c8 in g_main_context_dispatch ../glib/gmain.c:4272
> 
> The exit(1) also requires some tests to set up qtest to expect a return
> code of 1 from the QEMU process. Although we can check migration
> status changes to be fairly certain where the failure happened, there
> is always the possibility of QEMU exiting for another reason and the
> test passing. This happens frequently with sanitizers enabled, but
> also risks masking issues in the regular build.
> 
> Stop allowing the incoming migration to exit and instead require the
> tests to wait for the FAILED state and end QEMU gracefully with
> qtest_quit.
> 
> In practice this means setting exit-on-error=false for every incoming
> migration, changing MIG_TEST_FAIL_DEST_QUIT_ERR to MIG_TEST_FAIL and
> waiting for a change of state where necessary.
> 
> With this, the MIG_TEST_FAIL_DEST_QUIT_ERR error result is now unused,
> remove it.
> 
> The affected tests are:
> validate_uuid_error
> multifd_tcp_cancel
> dirty_limit
> precopy_unix_tls_x509_default_host
> precopy_tcp_tls_no_hostname
> tcp_tls_x509_mismatch_host
> dbus_vmstate_missing_src
> dbus_vmstate_missing_dst
> 
> Also add a comment to QEMU source explaining that the incoming
> coroutine might block for a while until it yields as this is the
> actual root cause of the issue.
> 
> Reviewed-by: Peter Xu <peterx@redhat.com>
> Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
> Link: https://lore.kernel.org/qemu-devel/20260311213418.16951-6-farosas@suse.de
> [assert that key doesn't already exists]
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>   migration/migration.c                 |  5 +++++
>   tests/qtest/dbus-vmstate-test.c       |  5 +++--
>   tests/qtest/migration/framework.c     |  5 +----
>   tests/qtest/migration/framework.h     |  2 --
>   tests/qtest/migration/migration-qmp.c |  7 +++++++
>   tests/qtest/migration/misc-tests.c    |  4 ++--
>   tests/qtest/migration/precopy-tests.c | 12 +++++-------
>   tests/qtest/migration/tls-tests.c     | 14 ++++++++------
>   8 files changed, 31 insertions(+), 23 deletions(-)

  Hi Fabiano,

this patch now triggers a failure in the qtests when I run them in
"SPEED=thorough" mode:

MESON_TEST_ITERATION=1 MALLOC_PERTURB_=120 
ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 G_TEST_SLOW=1 
PYTHON=/home/thuth/tmp/qemu-build/pyvenv/bin/python3 RUST_BACKTRACE=1 
QTEST_QEMU_IMG=./qemu-img 
G_TEST_DBUS_DAEMON=/home/thuth/devel/qemu/tests/dbus-vmstate-daemon.sh 
QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon 
UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 
MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 
QTEST_QEMU_BINARY=./qemu-system-x86_64 
/home/thuth/tmp/qemu-build/tests/qtest/migration-test --tap -k --full

TAP version 14
# random seed: R02Sb882c8142734dce2265e65214fd2b060
# starting QEMU: exec ./qemu-system-x86_64 -qtest 
unix:/tmp/qtest-106610.sock -qtest-log /dev/null -chardev 
socket,path=/tmp/qtest-106610.qmp,id=char0 -mon chardev=char0,mode=control 
-display none -audio none -run-with exit-with-parent=on -machine none -accel 
qtest
# Skipping test: userfaultfd not available
1..80
# Start of x86_64 tests
# Running /x86_64/dirty_limit
# Using machine type: pc-q35-11.0
# starting QEMU: exec ./qemu-system-x86_64 -qtest 
unix:/tmp/qtest-106610.sock -qtest-log /dev/null -chardev 
socket,path=/tmp/qtest-106610.qmp,id=char0 -mon chardev=char0,mode=control 
-display none -audio none -run-with exit-with-parent=on -accel 
kvm,dirty-ring-size=4096 -accel tcg -machine pc-q35-11.0, -name 
source,debug-threads=on -machine memory-backend=mig.mem -object 
memory-backend-ram,id=mig.mem,size=150M,share=off -serial 
file:/tmp/migration-test-8B95M3/src_serial -drive 
if=none,id=d0,file=/tmp/migration-test-8B95M3/bootsect,format=raw -device 
ide-hd,drive=d0,secs=1,cyls=1,heads=1  2>/dev/null -accel qtest
# starting QEMU: exec ./qemu-system-x86_64 -qtest 
unix:/tmp/qtest-106610.sock -qtest-log /dev/null -chardev 
socket,path=/tmp/qtest-106610.qmp,id=char0 -mon chardev=char0,mode=control 
-display none -audio none -run-with exit-with-parent=on -accel 
kvm,dirty-ring-size=4096 -accel tcg -machine pc-q35-11.0, -name 
target,debug-threads=on -machine memory-backend=mig.mem -object 
memory-backend-ram,id=mig.mem,size=150M,share=off -serial 
file:/tmp/migration-test-8B95M3/dest_serial -incoming 
unix:/tmp/migration-test-8B95M3/migsocket  -drive 
if=none,id=d0,file=/tmp/migration-test-8B95M3/bootsect,format=raw -device 
ide-hd,drive=d0,secs=1,cyls=1,heads=1  2>/dev/null -accel qtest
../../devel/qemu/tests/qtest/libqtest.c:201: kill_qemu() tried to terminate 
QEMU process but encountered exit status 1 (expected 0)
Aborted (core dumped)
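
For readers unfamiliar with the message above, here is a minimal sketch of
the kind of check it reports: the test harness reaps the child process and
compares its exit status with the status the test declared as expected.
This is an illustrative model only, not the libqtest code. The helper name
reap() and the stand-in child commands are hypothetical; the only things
taken from the log are the wording of the message and the statuses 1 and 0.

```python
import subprocess
import sys


def reap(argv, expected_status=0):
    """Run a child and compare its exit status with the expected one.

    Hypothetical helper: it only models the comparison reported by
    kill_qemu() in the log, it is not the libqtest implementation.
    Returns None on a match, or a description of the mismatch.
    """
    status = subprocess.run(argv).returncode
    if status != expected_status:
        return (f"child exited with status {status} "
                f"(expected {expected_status})")
    return None


# Stand-in for a QEMU process that calls exit(1) on a migration error.
failing_child = [sys.executable, "-c", "import sys; sys.exit(1)"]
# Stand-in for a QEMU process that is shut down cleanly.
clean_child = [sys.executable, "-c", "pass"]

mismatch = reap(failing_child)     # harness expects 0, child returns 1
accepted = reap(failing_child, 1)  # harness was told to expect 1
clean = reap(clean_child)          # child returns 0, as expected

print(mismatch)  # child exited with status 1 (expected 0)
```

With exit-on-error=false forced for every incoming migration, the tests no
longer declare an expected status of 1, so a destination that still exits
with 1 falls into the first case.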

Could you please check whether you can reproduce that crash?

  Thomas


