From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Juan Quintela <quintela@redhat.com>
Cc: qemu-devel@nongnu.org, Thomas Huth <thuth@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Laurent Vivier <lvivier@redhat.com>
Subject: Re: [PATCH 2/2] tests/qtest: make more migration pre-copy scenarios run non-live
Date: Thu, 20 Apr 2023 16:58:31 +0100
Message-ID: <ZEFhJ5OxmKxxmJpH@redhat.com>
In-Reply-To: <87wn26wvzf.fsf@secure.mitica>

On Thu, Apr 20, 2023 at 02:59:00PM +0200, Juan Quintela wrote:
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> > There are 27 pre-copy live migration scenarios being tested. In all of
> > these we force non-convergence and run for one iteration, then let it
> > converge and wait for completion during the second (or following)
> > iterations. At 3 mbps bandwidth limit the first iteration takes a very
> > long time (~30 seconds).
> >
> > While it is important to test the migration passes and convergence
> > logic, it is overkill to do this for all 27 pre-copy scenarios. The
> > TLS migration scenarios in particular are merely exercising different
> > code paths during connection establishment.
> >
> > To optimize time taken, switch most of the test scenarios to run
> > non-live (ie guest CPUs paused) with no bandwidth limits. This gives
> > a massive speed up for most of the test scenarios.
> >
> > For test coverage the following scenarios are unchanged
> >
> > * Precopy with UNIX sockets
> > * Precopy with UNIX sockets and dirty ring tracking
> > * Precopy with XBZRLE
> > * Precopy with multifd
>
> Just for completeness: the other test that is still slow is
> /migration/vcpu_dirty_limit.
>
> > -    migrate_ensure_non_converge(from);
> > +    if (args->live) {
> > +        migrate_ensure_non_converge(from);
> > +    } else {
> > +        migrate_ensure_converge(from);
> > +    }
>
> Looks ... weird?
> But the only way that I can think of improving it is to pass args to
> migrate_ensure_*() and that is a different kind of weird.

I'm going to change this a little anyway. Currently for the non-live
case, I start the migration and then stop the CPUs. I want to reverse
that order, as we should have CPUs paused before even starting the
migration to ensure we don't have any re-dirtied pages at all.

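
To illustrate the reordering, here is a rough sketch only, not the actual
patch. The names start_precopy(), stub_stop_cpus() and
stub_start_migration() are hypothetical, and the stubs merely record the
order of calls rather than talking to QEMU:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_STEPS 4

/* Log of the steps taken, in order, so the sequencing can be inspected. */
static const char *steps[MAX_STEPS];
static size_t nsteps;

static void record(const char *step)
{
    assert(nsteps < MAX_STEPS);
    steps[nsteps++] = step;
}

/* Hypothetical stubs: the real test would issue QMP commands instead. */
static void stub_stop_cpus(void)       { record("stop"); }
static void stub_start_migration(void) { record("migrate"); }

/*
 * Non-live case: pause the guest CPUs first, then start the migration,
 * so that no page can be re-dirtied once migration is under way.
 * Returns the number of steps recorded.
 */
static size_t start_precopy(bool live)
{
    nsteps = 0;
    if (!live) {
        stub_stop_cpus();
    }
    stub_start_migration();
    return nsteps;
}
```

With live == false the recorded order is "stop" then "migrate"; with
live == true only "migrate" is recorded.
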
>
> >      } else {
> > -        if (args->iterations) {
> > -            while (args->iterations--) {
> > +        if (args->live) {
> > +            if (args->iterations) {
> > +                while (args->iterations--) {
> > +                    wait_for_migration_pass(from);
> > +                }
> > +            } else {
> >                  wait_for_migration_pass(from);
> >              }
> > +
> > +            migrate_ensure_converge(from);
>
> I think we should change iterations to be 1 when we create args, but
> otherwise, treat 0 as 1 and change it to something in the lines of:
>
>     if (args->live) {
>         while (args->iterations-- >= 0) {
>             wait_for_migration_pass(from);
>         }
>         migrate_ensure_converge(from);
>
> What do you think?

I think in retrospect 'iterations' was overkill as we only use the
values 0 (implicitly 1) or 2. IOW we could just use a 'bool multipass'
to distinguish the two cases.

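
As a rough illustration of the 'bool multipass' idea, a sketch under
assumptions rather than the actual patch: PrecopyArgs,
run_precopy_passes() and stub_wait_for_migration_pass() are hypothetical
names, and the stub only counts calls instead of waiting on a real
migration pass:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the test arguments; only the fields discussed. */
typedef struct {
    bool live;       /* guest CPUs keep running during migration */
    bool multipass;  /* assumed replacement for the 'iterations' counter */
} PrecopyArgs;

static int passes_waited;

/* Stub standing in for wait_for_migration_pass(); it only counts calls. */
static void stub_wait_for_migration_pass(void)
{
    passes_waited++;
}

/*
 * Wait for one pass by default, or two when multipass is set. This covers
 * the only two values 'iterations' takes today: 0 (implicitly 1) and 2.
 * Returns the number of passes waited for.
 */
static int run_precopy_passes(const PrecopyArgs *args)
{
    assert(args);
    passes_waited = 0;
    if (args->live) {
        int wanted = args->multipass ? 2 : 1;
        for (int i = 0; i < wanted; i++) {
            stub_wait_for_migration_pass();
        }
    }
    return passes_waited;
}
```

This avoids mutating a counter in the args struct and removes the
special case for 0.
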

> > -    qtest_qmp_eventwait(to, "RESUME");
> > +    if (!args->live) {
> > +        qtest_qmp_discard_response(to, "{ 'execute' : 'cont'}");
> > +    }
> > +    if (!got_resume) {
> > +        qtest_qmp_eventwait(to, "RESUME");
> > +    }
> >
> >      wait_for_serial("dest_serial");
> >  }
>
> I was looking at the "culprit" of Lukas problem, and it is not directly
> obvious. I see that when we expect one event, we just drop any event
> that we are not interested in. I don't know if that is the proper
> behaviour or if that is what is affecting this test.

I've not successfully reproduced it yet, nor figured out a real
scenario where it could plausibly happen. I'm looking to add more
debug to help us out.

With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|