From: Fabiano Rosas <farosas@suse.de>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org, Juan Quintela <quintela@redhat.com>
Subject: Re: [PATCH v3 10/10] tests/migration-test: Add a test for postcopy hangs during RECOVER
Date: Mon, 09 Oct 2023 13:50:08 -0300
Message-ID: <87wmvveo5b.fsf@suse.de>
In-Reply-To: <87zg0wenkg.fsf@suse.de>

Fabiano Rosas <farosas@suse.de> writes:

> Peter Xu <peterx@redhat.com> writes:
>
>> On Thu, Oct 05, 2023 at 06:10:20PM -0300, Fabiano Rosas wrote:
>>> Peter Xu <peterx@redhat.com> writes:
>>>
>>> > On Thu, Oct 05, 2023 at 10:37:56AM -0300, Fabiano Rosas wrote:
>>> >> >> +    /*
>>> >> >> +     * Make sure both QEMU instances will go into RECOVER stage, then test
>>> >> >> +     * kicking them out using migrate-pause.
>>> >> >> +     */
>>> >> >> +    wait_for_postcopy_status(from, "postcopy-recover");
>>> >> >> +    wait_for_postcopy_status(to, "postcopy-recover");
>>> >> >
>>> >> > Is this wait out of place? I think we're trying to resume too fast after
>>> >> > migrate_recover():
>>> >> >
>>> >> > # {
>>> >> > # "error": {
>>> >> > # "class": "GenericError",
>>> >> > # "desc": "Cannot resume if there is no paused migration"
>>> >> > # }
>>> >> > # }
>>> >> >
>>> >>
>>> >> Ugh, sorry about the long lines:
>>> >>
>>> >> {
>>> >> "error": {
>>> >> "class": "GenericError",
>>> >> "desc": "Cannot resume if there is no paused migration"
>>> >> }
>>> >> }
>>> >
>>> > Sorry I didn't get you here. Could you elaborate your question?
>>> >
>>>
>>> The test is sometimes failing with the above message.
>>>
>>> But indeed my question doesn't make sense. I forgot migrate_recover
>>> happens on the destination. Nevermind.
>>>
>>> The bug is still present nonetheless. We're going into migrate_prepare
>>> in some state other than POSTCOPY_PAUSED.
>>
>> Oh I see. Interestingly I cannot reproduce on my host, just like last
>> time..
>>
>> What is your setup for running the test? Anything special? Here's my
>> cmdline:
>
> The crudest oneliner:
>
> for i in $(seq 1 9999); do echo "$i ============="; \
> QTEST_QEMU_BINARY=./qemu-system-x86_64 \
> ./tests/qtest/migration-test -r /x86_64/migration/postcopy/recovery || break ; done
>
> I suspect my system has something specific to it that affects the timing
> of the tests. But I have no idea what it could be.
>
> $ lscpu
> Architecture: x86_64
> CPU op-mode(s): 32-bit, 64-bit
> Address sizes: 39 bits physical, 48 bits virtual
> Byte Order: Little Endian
> CPU(s): 16
> On-line CPU(s) list: 0-15
> Vendor ID: GenuineIntel
> Model name: 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz
> CPU family: 6
> Model: 141
> Thread(s) per core: 2
> Core(s) per socket: 8
> Socket(s): 1
> Stepping: 1
> CPU max MHz: 4800.0000
> CPU min MHz: 800.0000
> BogoMIPS: 4992.00
>
>>
>> $ cat reproduce.sh
>> index=$1
>> loop=0
>>
>> while :; do
>>     echo "Starting loop=$loop..."
>>     QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test -p /x86_64/migration/postcopy/recovery/double-failures
>>     if [[ $? != 0 ]]; then
>>         echo "index $index REPRODUCED (loop=$loop) !"
>>         break
>>     fi
>>     loop=$(( loop + 1 ))
>> done
>>
>> Survives 200+ loops and kept going.
>>
>> However I think I saw what's wrong here, could you help try below fixup?
>>
>
> Sure. I won't get to it until tomorrow though.

It seems to have fixed the issue. 3500 iterations and still going.