From: Fabiano Rosas <farosas@suse.de>
To: Peter Xu <peterx@redhat.com>, qemu-devel@nongnu.org
Cc: peterx@redhat.com, Juan Quintela <quintela@redhat.com>
Subject: Re: [PATCH v3 10/10] tests/migration-test: Add a test for postcopy hangs during RECOVER
Date: Thu, 05 Oct 2023 10:37:56 -0300
Message-ID: <878r8hfavf.fsf@suse.de>
In-Reply-To: <87edi9fbh5.fsf@suse.de>

Fabiano Rosas <farosas@suse.de> writes:

> Peter Xu <peterx@redhat.com> writes:
>
>> From: Fabiano Rosas <farosas@suse.de>
>>
>> To do so, create two paired sockets, but make them not provide real data.
>> Feed those fake sockets to src/dst QEMUs for recovery to let them go into
>> RECOVER stage without going out.  Test that we can always kick it out and
>> recover again with the right ports.
>>
>> This patch is based on Fabiano's version here:
>>
>> https://lore.kernel.org/r/877cowmdu0.fsf@suse.de
>>
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> [peterx: write commit message, remove case 1, fix bugs, and more]
>> Signed-off-by: Peter Xu <peterx@redhat.com>
>> ---
>>  tests/qtest/migration-test.c | 94 ++++++++++++++++++++++++++++++++++++
>>  1 file changed, 94 insertions(+)
>>
>> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
>> index 46f1c275a2..fb7a3765e4 100644
>> --- a/tests/qtest/migration-test.c
>> +++ b/tests/qtest/migration-test.c
>> @@ -729,6 +729,7 @@ typedef struct {
>>      /* Postcopy specific fields */
>>      void *postcopy_data;
>>      bool postcopy_preempt;
>> +    bool postcopy_recovery_test_fail;
>>  } MigrateCommon;
>>  
>>  static int test_migrate_start(QTestState **from, QTestState **to,
>> @@ -1381,6 +1382,78 @@ static void test_postcopy_preempt_tls_psk(void)
>>  }
>>  #endif
>>  
>> +static void wait_for_postcopy_status(QTestState *one, const char *status)
>> +{
>> +    wait_for_migration_status(one, status,
>> +                              (const char * []) { "failed", "active",
>> +                                                  "completed", NULL });
>> +}
>> +
>> +static void postcopy_recover_fail(QTestState *from, QTestState *to)
>> +{
>> +    int ret, pair1[2], pair2[2];
>> +    char c;
>> +
>> +    /* Create two unrelated socketpairs */
>> +    ret = qemu_socketpair(PF_LOCAL, SOCK_STREAM, 0, pair1);
>> +    g_assert_cmpint(ret, ==, 0);
>> +
>> +    ret = qemu_socketpair(PF_LOCAL, SOCK_STREAM, 0, pair2);
>> +    g_assert_cmpint(ret, ==, 0);
>> +
>> +    /*
>> +     * Give the guests unpaired ends of the sockets, so they'll all be
>> +     * blocked at reading.  This mimics a wrong channel established.
>> +     */
>> +    qtest_qmp_fds_assert_success(from, &pair1[0], 1,
>> +                                 "{ 'execute': 'getfd',"
>> +                                 "  'arguments': { 'fdname': 'fd-mig' }}");
>> +    qtest_qmp_fds_assert_success(to, &pair2[0], 1,
>> +                                 "{ 'execute': 'getfd',"
>> +                                 "  'arguments': { 'fdname': 'fd-mig' }}");
>> +
>> +    /*
>> +     * Write the 1st byte as QEMU_VM_COMMAND (0x8) for the dest socket, to
>> +     * emulate the 1st byte of a real recovery, but stops from there to
>> +     * keep dest QEMU in RECOVER.  This is needed so that we can kick off
>> +     * the recover process on dest QEMU (by triggering the G_IO_IN event).
>> +     *
>> +     * NOTE: this trick is not needed on src QEMUs, because src doesn't
>> +     * rely on a pre-existing G_IO_IN event, so it will always trigger the
>> +     * upcoming recovery anyway even if it can read nothing.
>> +     */
>> +#define QEMU_VM_COMMAND              0x08
>> +    c = QEMU_VM_COMMAND;
>> +    ret = send(pair2[1], &c, 1, 0);
>> +    g_assert_cmpint(ret, ==, 1);
>> +
>> +    migrate_recover(to, "fd:fd-mig");
>> +    migrate_qmp(from, "fd:fd-mig", "{'resume': true}");
>> +
>> +    /*
>> +     * Make sure both QEMU instances will go into RECOVER stage, then test
>> +     * kicking them out using migrate-pause.
>> +     */
>> +    wait_for_postcopy_status(from, "postcopy-recover");
>> +    wait_for_postcopy_status(to, "postcopy-recover");
>
> Is this wait out of place? I think we're trying to resume too fast after
> migrate_recover():
>
> # {
> #     "error": {
> #         "class": "GenericError",
> #         "desc": "Cannot resume if there is no paused migration"
> #     }
> # }
>

Ugh, sorry about the long lines:

{
    "error": {
        "class": "GenericError",
        "desc": "Cannot resume if there is no paused migration"
    }
}



