From: Fabiano Rosas <farosas@suse.de>
To: "Daniel P. Berrangé" <berrange@redhat.com>, qemu-devel@nongnu.org
Cc: qemu-block@nongnu.org, "Paolo Bonzini" <pbonzini@redhat.com>,
"Thomas Huth" <thuth@redhat.com>, "John Snow" <jsnow@redhat.com>,
"Li Zhijian" <lizhijian@fujitsu.com>,
"Juan Quintela" <quintela@redhat.com>,
"Stefan Hajnoczi" <stefanha@redhat.com>,
"Zhang Chen" <chen.zhang@intel.com>,
"Laurent Vivier" <lvivier@redhat.com>,
"Daniel P. Berrangé" <berrange@redhat.com>
Subject: Re: [PATCH v2 4/6] tests/qtest: make more migration pre-copy scenarios run non-live
Date: Mon, 24 Apr 2023 18:01:36 -0300
Message-ID: <87jzy1ro3z.fsf@suse.de>
In-Reply-To: <20230421171411.566300-5-berrange@redhat.com>
Daniel P. Berrangé <berrange@redhat.com> writes:
> There are 27 pre-copy live migration scenarios being tested. In all of
> these we force non-convergence and run for one iteration, then let it
> converge and wait for completion during the second (or following)
> iterations. At 3 mbps bandwidth limit the first iteration takes a very
> long time (~30 seconds).
>
> While it is important to test the migration passes and convergence
> logic, it is overkill to do this for all 27 pre-copy scenarios. The
> TLS migration scenarios in particular are merely exercising different
> code paths during connection establishment.
>
> To optimize time taken, switch most of the test scenarios to run
> non-live (i.e. guest CPUs paused) with no bandwidth limits. This gives
> a massive speed up for most of the test scenarios.
>
> For test coverage the following scenarios are unchanged
>
> * Precopy with UNIX sockets
> * Precopy with UNIX sockets and dirty ring tracking
> * Precopy with XBZRLE
> * Precopy with multifd
>
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
> tests/qtest/migration-test.c | 60 ++++++++++++++++++++++++++++++------
> 1 file changed, 50 insertions(+), 10 deletions(-)
>
> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> index 6492ffa7fe..40d0f75480 100644
> --- a/tests/qtest/migration-test.c
> +++ b/tests/qtest/migration-test.c
> @@ -568,6 +568,9 @@ typedef struct {
>          MIG_TEST_FAIL_DEST_QUIT_ERR,
>      } result;
>
> +    /* Whether the guest CPUs should be running during migration */
> +    bool live;
> +
>      /* Postcopy specific fields */
>      void *postcopy_data;
>      bool postcopy_preempt;
> @@ -1324,8 +1327,6 @@ static void test_precopy_common(MigrateCommon *args)
>          return;
>      }
>
> -    migrate_ensure_non_converge(from);
> -
>      if (args->start_hook) {
>          data_hook = args->start_hook(from, to);
>      }
> @@ -1335,6 +1336,31 @@ static void test_precopy_common(MigrateCommon *args)
>          wait_for_serial("src_serial");
>      }
>
> +    if (args->live) {
> +        /*
> +         * Testing live migration, we want to ensure that some
> +         * memory is re-dirtied after being transferred, so that
> +         * we exercise logic for dirty page handling. We achieve
> +         * this with a ridiculosly low bandwidth that guarantees
> +         * non-convergance.
> +         */
> +        migrate_ensure_non_converge(from);
> +    } else {
> +        /*
> +         * Testing non-live migration, we allow it to run at
> +         * full speed to ensure short test case duration.
> +         * For tests expected to fail, we don't need to
> +         * change anything.
> +         */
> +        if (args->result == MIG_TEST_SUCCEED) {
> +            qtest_qmp_assert_success(from, "{ 'execute' : 'stop'}");
> +            if (!got_stop) {
> +                qtest_qmp_eventwait(from, "STOP");
> +            }
> +            migrate_ensure_converge(from);
> +        }
> +    }
> +
>      if (!args->connect_uri) {
>          g_autofree char *local_connect_uri =
>              migrate_get_socket_address(to, "socket-address");
> @@ -1352,19 +1378,29 @@ static void test_precopy_common(MigrateCommon *args)
>              qtest_set_expected_status(to, EXIT_FAILURE);
>          }
>      } else {
> -        wait_for_migration_pass(from);
> +        if (args->live) {
> +            wait_for_migration_pass(from);
>
> -        migrate_ensure_converge(from);
> +            migrate_ensure_converge(from);
>
> -        /* We do this first, as it has a timeout to stop us
> -         * hanging forever if migration didn't converge */
> -        wait_for_migration_complete(from);
> +            /*
> +             * We do this first, as it has a timeout to stop us
> +             * hanging forever if migration didn't converge
> +             */
> +            wait_for_migration_complete(from);
> +
> +            if (!got_stop) {
> +                qtest_qmp_eventwait(from, "STOP");
> +            }
> +        } else {
> +            wait_for_migration_complete(from);
>
> -        if (!got_stop) {
> -            qtest_qmp_eventwait(from, "STOP");
> +            qtest_qmp_assert_success(to, "{ 'execute' : 'cont'}");

I retested and the problem still persists. The issue is with this
wait + cont sequence:

    wait_for_migration_complete(from);
    qtest_qmp_assert_success(to, "{ 'execute' : 'cont'}");

We wait for the source to finish, but by the time qmp_cont executes the
dst is still in INMIGRATE, so autostart gets set and I never see the
RESUME event.

When the migration finishes on the dst, the VM is put in
RUN_STATE_PAUSED (in process_incoming_migration_bh):

    if (!global_state_received() ||
        global_state_get_runstate() == RUN_STATE_RUNNING) {
        if (autostart) {
            vm_start();
        } else {
            runstate_set(RUN_STATE_PAUSED);
        }
    } else if (migration_incoming_colo_enabled()) {
        migration_incoming_disable_colo();
        vm_start();
    } else {
        runstate_set(global_state_get_runstate()); <-- HERE
    }

Do we need to add something like this to that routine?

    if (autostart &&
        global_state_get_runstate() != RUN_STATE_RUNNING) {
        vm_start();
    }

Otherwise it seems we'll just ignore a 'cont' that was received while
the migration was still ongoing.

>          }
>
> -        qtest_qmp_eventwait(to, "RESUME");
> +        if (!got_resume) {
> +            qtest_qmp_eventwait(to, "RESUME");
> +        }
>
>          wait_for_serial("dest_serial");
>      }