qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Thomas Huth <thuth@redhat.com>
Cc: "Peter Maydell" <peter.maydell@linaro.org>,
	quintela@redhat.com, "Daniel P. Berrangé" <berrange@redhat.com>,
	qemu-devel@nongnu.org, "Alex Bennée" <alex.bennee@linaro.org>
Subject: Re: [PATCH] tests/qtest/migration-test: Disable migration/multifd/tcp/plain/cancel
Date: Mon, 6 Mar 2023 13:44:38 +0000	[thread overview]
Message-ID: <ZAXuRp4p7heAbFtF@work-vm> (raw)
In-Reply-To: <53ca67e4-fb2f-17ac-2087-9faa7aba5187@redhat.com>

* Thomas Huth (thuth@redhat.com) wrote:
> On 03/03/2023 13.05, Peter Maydell wrote:
> > On Fri, 3 Mar 2023 at 11:29, Thomas Huth <thuth@redhat.com> wrote:
> > > 
> > > On 03/03/2023 12.18, Peter Maydell wrote:
> > > > On Fri, 3 Mar 2023 at 09:10, Juan Quintela <quintela@redhat.com> wrote:
> > > > > 
> > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > On Thu, Mar 02, 2023 at 05:22:11PM +0000, Peter Maydell wrote:
> > > > > > > migration-test has been flaky for a long time, both in CI and
> > > > > > > otherwise:
> > > > > > > 
> > > > > > > https://gitlab.com/qemu-project/qemu/-/jobs/3806090216
> > > > > > > (a FreeBSD job)
> > > > > > >     32/648 ERROR:../tests/qtest/migration-helpers.c:205:wait_for_migration_status: assertion failed: (g_test_timer_elapsed() < MIGRATION_STATUS_WAIT_TIMEOUT) ERROR
> > > > > > > 
> > > > > > > on a local macos x86 box:
> > > > 
> > > > 
> > > > 
> > > > > What is really weird with this failure is that:
> > > > > - it only happens on non-x86
> > > > 
> > > > No, I have seen it on x86 macos, and x86 OpenBSD
> > > > 
> > > > > - on code that is not arch dependent
> > > > > - on cancel, what we really do there is close fd's for the multifd
> > > > >     channel threads to get out of the recv, i.e. again, nothing that
> > > > >     should be arch dependent.
> > > > 
> > > > I'm pretty sure that it tends to happen when the machine that's
> > > > running the test is heavily loaded. You probably have a race condition.
> > > 
> > > I think I can second that. IIRC I've seen it a couple of times on my x86
> > > laptop when running "make check -j$(nproc) SPEED=slow" here.
> > 
> > And another on-x86 failure case, just now, on the FreeBSD x86 CI job:
> > https://gitlab.com/qemu-project/qemu/-/jobs/3870165180
> 
> And FWIW, I just saw this while doing "make vm-build-netbsd J=4":
> 
> ▶  31/645 ERROR:../src/tests/qtest/migration-test.c:1841:test_migrate_auto_converge: 'got_stop' should be FALSE ERROR

That one is kind of interesting; this is an auto converge test - so it
tries to setup migration so it won't finish, to check that the auto
converge kicks in.  Except in this case the migration *did* finish
without the autoconverge (significantly) kicking in.

So I guess any of:
  a) The CPU thread never got much CPU time so not much dirtying
happened.
  b) The bandwidth calculations might be bad enough/course enough
that it's passing the (very low) bandwidth limit due to bad
approximation at bandwidth needed.
  c) The autoconverge jump happens fast enough for that loop
to hit the got_stop in the loop time of that loop.

I guess we could:
  i) Reduce the usleep in test_migrate_auto_converge
    (So it is more likely to correctly drop out of that loop
    as soon as autoconverge kicks in)
  ii) Reduce inc_pct so that autoconverge kicks in slower
  iii) Reduce max-bandwidth in migrate_ensure_non_converge
     even further.

Dave

>  31/645 qemu:qtest+qtest-i386 / qtest-i386/migration-test                                  ERROR          25.21s   killed by signal 6 SIGABRT
> > > > QTEST_QEMU_BINARY=./qemu-system-i386 MALLOC_PERTURB_=35 G_TEST_DBUS_DAEMON=/home/qemu/qemu-test.fYHKFz/src/tests/dbus-vmstate-daemon.sh QTEST_QEMU_IMG=./qemu-img /home/qemu/qemu-test.fYHKFz/build/tests/qtest/migration-test --tap -k
> ―――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――― ✀  ―――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
> stderr:
> qemu: thread naming not supported on this host
> qemu: thread naming not supported on this host
> qemu: thread naming not supported on this host
> qemu: thread naming not supported on this host
> qemu: thread naming not supported on this host
> qemu: thread naming not supported on this host
> **
> ERROR:../src/tests/qtest/migration-test.c:1841:test_migrate_auto_converge: 'got_stop' should be FALSE
> 
> (test program exited with status code -6)
> 
>  Thomas
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



  reply	other threads:[~2023-03-06 13:45 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-02 17:22 [PATCH] tests/qtest/migration-test: Disable migration/multifd/tcp/plain/cancel Peter Maydell
2023-03-02 17:34 ` Daniel P. Berrangé
2023-03-03  9:10   ` Juan Quintela
2023-03-03  9:12     ` Daniel P. Berrangé
2023-03-03 11:18     ` Peter Maydell
2023-03-03 11:28       ` Thomas Huth
2023-03-03 11:43         ` Peter Maydell
2023-03-03 12:05         ` Peter Maydell
2023-03-06 13:08           ` Thomas Huth
2023-03-06 13:44             ` Dr. David Alan Gilbert [this message]
2023-03-06 14:00               ` Daniel P. Berrangé
2023-03-06 14:09                 ` Dr. David Alan Gilbert
2023-03-06 15:17                 ` Dr. David Alan Gilbert
2023-03-02 17:37 ` Dr. David Alan Gilbert
2023-03-02 22:25   ` Philippe Mathieu-Daudé
2023-03-03  7:43 ` Thomas Huth
2023-03-03  9:08 ` Juan Quintela
2023-03-04 15:39 ` Peter Maydell
2023-03-07  9:53   ` Peter Maydell
2023-03-12 14:06     ` Peter Maydell
2023-03-12 17:46       ` Peter Maydell
2023-03-14 10:11         ` Dr. David Alan Gilbert
2023-03-14 12:46           ` Peter Maydell
2023-03-14 13:05             ` Daniel P. Berrangé
2023-03-14 13:13             ` Dr. David Alan Gilbert
2023-03-14 16:46           ` Peter Xu
2023-03-14 17:48             ` Daniel P. Berrangé
2023-03-14 19:31             ` Peter Maydell
2023-03-14 20:51               ` Peter Xu
2023-03-22 20:15     ` Peter Maydell
2023-04-03 19:16       ` Peter Maydell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZAXuRp4p7heAbFtF@work-vm \
    --to=dgilbert@redhat.com \
    --cc=alex.bennee@linaro.org \
    --cc=berrange@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=thuth@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).