From: Fabiano Rosas <farosas@suse.de>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org,
	"Daniel P . Berrangé" <berrange@redhat.com>,
	"Laurent Vivier" <lvivier@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>
Subject: Re: [PATCH v2 8/9] tests/qtest/migration: Add a cancel test
Date: Tue, 11 Feb 2025 18:23:20 -0300
Message-ID: <87pljouqev.fsf@suse.de>
In-Reply-To: <Z6urZeOyLYRJzMM8@x1.local>

Peter Xu <peterx@redhat.com> writes:

> On Tue, Feb 11, 2025 at 12:01:35PM -0300, Fabiano Rosas wrote:
>> The qmp_migrate_cancel() command is poorly tested and code inspection
>> reveals that there might be concurrency issues with its usage. Add a
>> test that runs a migration and calls qmp_migrate_cancel() at specific
>> moments.
>> 
>> In order to make the test more deterministic, instead of calling
>> qmp_migrate_cancel() at random moments during migration, do it after
>> the migration status change events are seen.
>> 
>> The expected result is that qmp_migrate_cancel() on the source ends
>> migration on the source with the "cancelled" state and ends migration
>> on the destination with the "failed" state. The only exception is that
>> a failed migration should continue in the failed state.
>> 
>> Cancelling is not allowed during postcopy (no test is added for this
>> because it's a trivial check in the code).
>> 
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> ---
>>  tests/qtest/migration/precopy-tests.c | 176 ++++++++++++++++++++++++++
>>  1 file changed, 176 insertions(+)
>> 
>> diff --git a/tests/qtest/migration/precopy-tests.c b/tests/qtest/migration/precopy-tests.c
>> index 162fa69531..ba273d10b9 100644
>> --- a/tests/qtest/migration/precopy-tests.c
>> +++ b/tests/qtest/migration/precopy-tests.c
>> @@ -20,6 +20,7 @@
>>  #include "migration/migration-util.h"
>>  #include "ppc-util.h"
>>  #include "qobject/qlist.h"
>> +#include "qapi-types-migration.h"
>>  #include "qemu/module.h"
>>  #include "qemu/option.h"
>>  #include "qemu/range.h"
>> @@ -536,6 +537,161 @@ static void test_multifd_tcp_cancel(void)
>>      migrate_end(from, to2, true);
>>  }
>>  
>> +static void test_cancel_src_after_failed(QTestState *from, QTestState *to,
>> +                                         const char *uri, const char *phase)
>> +{
>> +    /*
>> +     * No migrate_incoming_qmp() at the start to force source into
>> +     * failed state during migrate_qmp().
>> +     */
>> +
>> +    wait_for_serial("src_serial");
>> +    migrate_ensure_converge(from);
>> +
>> +    migrate_qmp(from, to, uri, NULL, "{}");
>> +
>> +    migration_event_wait(from, phase);
>> +    migrate_cancel(from);
>> +
>> +    /* cancelling will not move the migration out of 'failed' */
>> +
>> +    wait_for_migration_status(from, "failed",
>> +                              (const char * []) { "completed", NULL });
>> +
>> +    /*
>> +     * Not waiting for the destination because it never started
>> +     * migration.
>> +     */
>> +}
>> +
>> +static void test_cancel_src_after_cancelled(QTestState *from, QTestState *to,
>> +                                            const char *uri, const char *phase)
>> +{
>> +    migrate_incoming_qmp(to, uri, NULL, "{ 'exit-on-error': false }");
>> +
>> +    wait_for_serial("src_serial");
>> +    migrate_ensure_converge(from);
>> +
>> +    migrate_qmp(from, to, uri, NULL, "{}");
>> +
>> +    /* To move to cancelled/cancelling */
>> +    migrate_cancel(from);
>> +    migration_event_wait(from, phase);
>> +
>> +    /* The migrate_cancel under test */
>> +    migrate_cancel(from);
>> +
>> +    wait_for_migration_status(from, "cancelled",
>> +                              (const char * []) { "completed", NULL });
>> +
>> +    wait_for_migration_status(to, "failed",
>> +                              (const char * []) { "completed", NULL });
>> +}
>> +
>> +static void test_cancel_src_after_complete(QTestState *from, QTestState *to,
>> +                                           const char *uri, const char *phase)
>> +{
>> +    migrate_incoming_qmp(to, uri, NULL, "{ 'exit-on-error': false }");
>> +
>> +    wait_for_serial("src_serial");
>> +    migrate_ensure_converge(from);
>> +
>> +    migrate_qmp(from, to, uri, NULL, "{}");
>> +
>> +    migration_event_wait(from, phase);
>> +    migrate_cancel(from);
>> +
>> +    /*
>> +     * qmp_migrate_cancel() exits early if migration is not running
>> +     * anymore, the status will not change to cancelled.
>> +     */
>> +    wait_for_migration_complete(from);
>> +    wait_for_migration_complete(to);
>> +}
>> +
>> +static void test_cancel_src_after_none(QTestState *from, QTestState *to,
>> +                                       const char *uri, const char *phase)
>> +{
>> +    /*
>> +     * Test that cancelling without a migration happening does not
>> +     * affect subsequent migrations
>> +     */
>> +    migrate_cancel(to);
>> +
>> +    wait_for_serial("src_serial");
>> +    migrate_cancel(from);
>> +
>> +    migrate_incoming_qmp(to, uri, NULL, "{ 'exit-on-error': false }");
>> +
>> +    migrate_ensure_converge(from);
>> +    migrate_qmp(from, to, uri, NULL, "{}");
>> +
>> +    wait_for_migration_complete(from);
>> +    wait_for_migration_complete(to);
>> +}
>> +
>> +static void test_cancel_src_pre_switchover(QTestState *from, QTestState *to,
>> +                                           const char *uri, const char *phase)
>> +{
>> +    migrate_set_capability(from, "pause-before-switchover", true);
>> +    migrate_set_capability(to, "pause-before-switchover", true);
>> +
>> +    migrate_set_capability(from, "multifd", true);
>> +    migrate_set_capability(to, "multifd", true);
>> +
>> +    migrate_incoming_qmp(to, uri, NULL, "{ 'exit-on-error': false }");
>> +
>> +    wait_for_serial("src_serial");
>> +    migrate_ensure_converge(from);
>> +
>> +    migrate_qmp(from, to, uri, NULL, "{}");
>> +
>> +    migration_event_wait(from, phase);
>> +    migrate_cancel(from);
>> +    migration_event_wait(from, "cancelling");
>> +
>> +    wait_for_migration_status(from, "cancelled",
>> +                              (const char * []) { "completed", NULL });
>> +
>> +    wait_for_migration_status(to, "failed",
>> +                              (const char * []) { "completed", NULL });
>> +}
>> +
>> +static void test_cancel_src_after_status(void *opaque)
>> +{
>> +    const char *test_path = opaque;
>> +    g_autofree char *phase = g_path_get_basename(test_path);
>> +    g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
>> +    QTestState *from, *to;
>> +    MigrateStart args = {
>> +        .hide_stderr = true,
>> +    };
>> +
>> +    if (migrate_start(&from, &to, "defer", &args)) {
>> +        return;
>> +    }
>> +
>> +    if (g_str_equal(phase, "cancelling") ||
>> +        g_str_equal(phase, "cancelled")) {
>> +        test_cancel_src_after_cancelled(from, to, uri, phase);
>> +
>> +    } else if (g_str_equal(phase, "completed")) {
>> +        test_cancel_src_after_complete(from, to, uri, phase);
>> +
>> +    } else if (g_str_equal(phase, "failed")) {
>> +        test_cancel_src_after_failed(from, to, uri, phase);
>> +
>> +    } else if (g_str_equal(phase, "none")) {
>> +        test_cancel_src_after_none(from, to, uri, phase);
>> +
>> +    } else {
>> +        /* any state that comes before pre-switchover */
>> +        test_cancel_src_pre_switchover(from, to, uri, phase);
>
> [1]
>
>> +    }
>> +
>> +    migrate_end(from, to, false);
>> +}
>
> I'm OK with the current state, considering that it at least enlarges our
> cancel test cases, so it's definitely good to have:
>
> Reviewed-by: Peter Xu <peterx@redhat.com>
>
> Though one thing to mention is that the new "test_full()" abstraction
> doesn't yet look like it benefits us a whole lot, IMHO.
>
> These are the new tests:
>
> # /x86_64/migration/cancel/src/after/none
> # /x86_64/migration/cancel/src/after/setup             [*]
> # /x86_64/migration/cancel/src/after/cancelling
> # /x86_64/migration/cancel/src/after/cancelled
> # /x86_64/migration/cancel/src/after/active
> # /x86_64/migration/cancel/src/after/completed
> # /x86_64/migration/cancel/src/after/failed
> # /x86_64/migration/cancel/src/after/pre-switchover    [*]
>
> We have only one abstracted path [1] to test arbitrary statuses, but so far
> it only covers the two cases marked with [*].  It is hard to say whether the
> abstraction is necessary, or whether it would be easier to always register
> separate test cases.  So it's still slightly debatable whether we should turn
> the whole "if .. else if .. else if ... else" chain above into separate tests.
>

It gets super boilerplatey:


    for (int i = MIGRATION_STATUS_NONE; i < MIGRATION_STATUS__MAX; i++) {
        switch (i) {
        case MIGRATION_STATUS_DEVICE:          /* happens too fast */
        case MIGRATION_STATUS_WAIT_UNPLUG:     /* no support in tests */
        case MIGRATION_STATUS_COLO:            /* no support in tests */
        case MIGRATION_STATUS_POSTCOPY_ACTIVE: /* postcopy can't be cancelled */
        case MIGRATION_STATUS_POSTCOPY_PAUSED:
        case MIGRATION_STATUS_POSTCOPY_RECOVER_SETUP:
        case MIGRATION_STATUS_POSTCOPY_RECOVER:
            continue;
        case MIGRATION_STATUS_NONE:
            migration_test_add("/migration/cancel/src/after/none",
                               test_cancel_src_after_none);
            break;
        case MIGRATION_STATUS_SETUP:
            migration_test_add("/migration/cancel/src/after/setup",
                               test_cancel_src_after_setup);
            break;
        case MIGRATION_STATUS_CANCELLING:
            migration_test_add("/migration/cancel/src/after/cancelling",
                               test_cancel_src_after_cancelling);
            break;
        case MIGRATION_STATUS_CANCELLED:
            migration_test_add("/migration/cancel/src/after/cancelled",
                               test_cancel_src_after_cancelled);
            break;
        case MIGRATION_STATUS_ACTIVE:
            migration_test_add("/migration/cancel/src/after/active",
                               test_cancel_src_after_active);
            break;
        case MIGRATION_STATUS_COMPLETED:
            migration_test_add("/migration/cancel/src/after/completed",
                               test_cancel_src_after_completed);
            break;
        case MIGRATION_STATUS_FAILED:
            migration_test_add("/migration/cancel/src/after/failed",
                               test_cancel_src_after_failed);
            break;
        case MIGRATION_STATUS_PRE_SWITCHOVER:
            migration_test_add("/migration/cancel/src/after/pre-switchover",
                               test_cancel_src_after_pre_switchover);
            break;
        }
    }

}

void test_cancel_src_after_cancelling(void)
{
    test_cancel_src_after_cancel("cancelling");
}

void test_cancel_src_after_cancelled(void)
{
    test_cancel_src_after_cancel("cancelled");
}

void test_cancel_src_after_setup(void)
{
    test_cancel_src_after("setup");
}

void test_cancel_src_after_active(void)
{
    test_cancel_src_after("active");
}

void test_cancel_src_after_pre_switchover(void)
{
    test_cancel_src_after("pre-switchover");
}

static void test_cancel_src_after_failed(void)
{
    ...
    migration_event_wait(from, "failed");
    ...
}

static void test_cancel_src_after_cancel(const char *phase)
{
    ...
    migration_event_wait(from, phase);
    ...
}

static void test_cancel_src_after_complete(void)
{
    /* the status string is "completed", matching the test path basename */
    migration_event_wait(from, "completed");
    ...
}

static void test_cancel_src_after_none(void)
{
    ...
}

static void test_cancel_src_after(const char *phase)
{
    ...
    migration_event_wait(from, phase);
    ...
}
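
For comparison, a table-driven lookup might avoid both the if/else chain
and the per-status wrappers. The following is only a minimal,
self-contained sketch in plain C, not code from this series: CancelCase,
cancel_cases, phase_from_path() and cancel_case_lookup() are hypothetical
names, and strrchr() stands in for g_path_get_basename(). The expected
source statuses are the ones the tests in the patch wait for.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry; not a QEMU type. */
typedef struct {
    const char *phase;      /* last component of the test path */
    const char *expect_src; /* final source status the test waits for */
} CancelCase;

/* One row per test registered by the patch. */
static const CancelCase cancel_cases[] = {
    { "none",           "completed" }, /* cancel with no migration running */
    { "setup",          "cancelled" },
    { "cancelling",     "cancelled" },
    { "cancelled",      "cancelled" },
    { "active",         "cancelled" },
    { "completed",      "completed" }, /* cancel exits early */
    { "failed",         "failed"    }, /* failed stays failed */
    { "pre-switchover", "cancelled" },
};

/* Simplified stand-in for g_path_get_basename(): no allocation and no
 * handling of trailing slashes. */
const char *phase_from_path(const char *path)
{
    assert(path != NULL);
    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}

/* Return the table row for a test path, or NULL for statuses that have
 * no test (postcopy, colo, etc.). */
const CancelCase *cancel_case_lookup(const char *path)
{
    const char *phase = phase_from_path(path);
    size_t n = sizeof(cancel_cases) / sizeof(cancel_cases[0]);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(cancel_cases[i].phase, phase) == 0) {
            return &cancel_cases[i];
        }
    }
    return NULL;
}
```

With something like this, a single registration loop could walk
cancel_cases, and the test body could wait for the row's expect_src.
Whether that reads better than the if/else chain is the same judgment
call discussed above.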

