From: Fabiano Rosas <farosas@suse.de>
To: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>,
Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>,
qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, hreitz@redhat.com, kwolf@redhat.com,
eesposit@redhat.com, den@virtuozzo.com,
Peter Xu <peterx@redhat.com>
Subject: Re: [Bug Report][RFC PATCH 1/1] block: fix failing assert on paused VM migration
Date: Thu, 10 Oct 2024 16:21:50 -0300 [thread overview]
Message-ID: <87a5fbsr01.fsf@suse.de> (raw)
In-Reply-To: <a2a13861-ed60-4fc7-b57b-51028179406c@yandex-team.ru>
Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> writes:
> On 10.10.24 17:30, Fabiano Rosas wrote:
>> Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> writes:
>>
>>> On 09.10.24 23:53, Fabiano Rosas wrote:
>>>> Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> writes:
>>>>
>>>>> On 30.09.24 17:07, Andrey Drobyshev wrote:
>>>>>> On 9/30/24 12:25 PM, Vladimir Sementsov-Ogievskiy wrote:
>>>>>>> [add migration maintainers]
>>>>>>>
>>>>>>> On 24.09.24 15:56, Andrey Drobyshev wrote:
>>>>>>>> [...]
>>>>>>>
>>>>>>> I doubt that this a correct way to go.
>>>>>>>
>>>>>>> As far as I understand, "inactive" actually means that "storage is not
>>>>>>> belong to qemu, but to someone else (another qemu process for example),
>>>>>>> and may be changed transparently". In turn this means that Qemu should
>>>>>>> do nothing with inactive disks. So the problem is that nobody called
>>>>>>> bdrv_activate_all on target, and we shouldn't ignore that.
>>>>>>>
>>>>>>> Hmm, I see in process_incoming_migration_bh() we do call
>>>>>>> bdrv_activate_all(), but only in some scenarios. May be, the condition
>>>>>>> should be less strict here.
>>>>>>>
>>>>>>> Why we need any condition here at all? Don't we want to activate
>>>>>>> block-layer on target after migration anyway?
>>>>>>>
>>>>>>
>>>>>> Hmm I'm not sure about the unconditional activation, since we at least
>>>>>> have to honor LATE_BLOCK_ACTIVATE cap if it's set (and probably delay it
>>>>>> in such a case). In current libvirt upstream I see such code:
>>>>>>
>>>>>>> /* Migration capabilities which should always be enabled as long as they
>>>>>>> * are supported by QEMU. If the capability is supposed to be enabled on both
>>>>>>> * sides of migration, it won't be enabled unless both sides support it.
>>>>>>> */
>>>>>>> static const qemuMigrationParamsAlwaysOnItem qemuMigrationParamsAlwaysOn[] = {
>>>>>>> {QEMU_MIGRATION_CAP_PAUSE_BEFORE_SWITCHOVER,
>>>>>>> QEMU_MIGRATION_SOURCE},
>>>>>>>
>>>>>>> {QEMU_MIGRATION_CAP_LATE_BLOCK_ACTIVATE,
>>>>>>> QEMU_MIGRATION_DESTINATION},
>>>>>>> };
>>>>>>
>>>>>> which means that libvirt always wants LATE_BLOCK_ACTIVATE to be set.
>>>>>>
>>>>>> The code from process_incoming_migration_bh() you're referring to:
>>>>>>
>>>>>>> /* If capability late_block_activate is set:
>>>>>>> * Only fire up the block code now if we're going to restart the
>>>>>>> * VM, else 'cont' will do it.
>>>>>>> * This causes file locking to happen; so we don't want it to happen
>>>>>>> * unless we really are starting the VM.
>>>>>>> */
>>>>>>> if (!migrate_late_block_activate() ||
>>>>>>> (autostart && (!global_state_received() ||
>>>>>>> runstate_is_live(global_state_get_runstate())))) {
>>>>>>> /* Make sure all file formats throw away their mutable metadata.
>>>>>>> * If we get an error here, just don't restart the VM yet. */
>>>>>>> bdrv_activate_all(&local_err);
>>>>>>> if (local_err) {
>>>>>>> error_report_err(local_err);
>>>>>>> local_err = NULL;
>>>>>>> autostart = false;
>>>>>>> }
>>>>>>> }
>>>>>>
>>>>>> It states explicitly that we're either going to start VM right at this
>>>>>> point if (autostart == true), or we wait till "cont" command happens.
>>>>>> None of this is going to happen if we start another migration while
>>>>>> still being in PAUSED state. So I think it seems reasonable to take
>>>>>> such case into account. For instance, this patch does prevent the crash:
>>>>>>
>>>>>>> diff --git a/migration/migration.c b/migration/migration.c
>>>>>>> index ae2be31557..3222f6745b 100644
>>>>>>> --- a/migration/migration.c
>>>>>>> +++ b/migration/migration.c
>>>>>>> @@ -733,7 +733,8 @@ static void process_incoming_migration_bh(void *opaque)
>>>>>>> */
>>>>>>> if (!migrate_late_block_activate() ||
>>>>>>> (autostart && (!global_state_received() ||
>>>>>>> - runstate_is_live(global_state_get_runstate())))) {
>>>>>>> + runstate_is_live(global_state_get_runstate()))) ||
>>>>>>> + (!autostart && global_state_get_runstate() == RUN_STATE_PAUSED)) {
>>>>>>> /* Make sure all file formats throw away their mutable metadata.
>>>>>>> * If we get an error here, just don't restart the VM yet. */
>>>>>>> bdrv_activate_all(&local_err);
>>>>>>
>>>>>> What are your thoughts on it?
>>>>>>
>>>>
>>>> This bug is the same as https://gitlab.com/qemu-project/qemu/-/issues/2395
>>>>
>>>>>
>>>>> Hmmm... Don't we violate "late-block-activate" contract by this?
>>>>>
>>>>> Me go and check:
>>>>>
>>>>> # @late-block-activate: If enabled, the destination will not activate
>>>>> # block devices (and thus take locks) immediately at the end of
>>>>> # migration. (since 3.0)
>>>>>
>>>>> Yes, we'll violate it by this patch. So, for now the only exception is
>>>>> when autostart is enabled, but libvirt correctly use
>>>>> late-block-activate + !autostart.
>>>>>
>>>>> Interesting, when block layer is assumed to be activated.. Aha, only in qmp_cont().
>>>>>
>>>>>
>>>>> So, what to do with this all:
>>>>>
>>>>> Either libvirt should not use late-block-activate for migration of
>>>>> stopped vm. This way target would be automatically activated
>>>>>
>>>>> Or if libvirt still need postponed activation (I assume, for correctly
>>>>> switching shared disks, etc), Libvirt should do some additional QMP
>>>>> call. It can't be "cont", if we don't want to run the VM. So,
>>>>> probably, we need additional "block-activate" QMP command for this.
>>>>
>>>> A third option might be to unconditionally activate in qmp_migrate:
>>>
>>> Yes. But is migration the only operation with vm which requires block
>>> layer be activated? I think actually a lot of operation require
>>> that.. Any block-layer releated qmp command actually. And do automatic
>>> activation in all of them I think is a wrong way.
>>
>> Yes, good point. I don't know how other commands behave in this
>> situation. It would be good to have an unified solution. I'll check.
>>
>>>
>>> Moreover, if we have explicit possibility to "postpone activation", we
>>> should provide a way to "activate by hand".
>>
>> Maybe, but it doesn't really follows. We have been activating
>> automatically until now, after all (from qmp_cont). Also, having to go
>> change libvirt code just for this is not ideal.
>>
>>>
>>> So I still think correct fix is reporting error from qmp_migrate when
>>> block-layer is inactive, and add some possibility to activate through
>>> QMP.
>>
>> Unfortunately, for migration that's bad user experience: we allow the
>> first migration of a paused VM with no issues, then on the second one we
>> error out asking for a command to be run, which only does a
>> bdrv_activate_all() that QEMU could very well do itself.
>>
>
> Hmm. Now I tend to agree that it's OK to activate block-layer on "migrate" command automatically.
>
> Full solution should consider also another block-layer related qmp
> commands, but that would require a lot more work.
So far, only bdrv_co_write_req_prepare() seems to have a problem. All
other asserts are due to the process of activation/inactivation itself.
I can only assume that paths checking the flag and doing some operation
based on it are still going to work as expected.
@Andrey, do you want to send a patch with the migration fix? I can work
on the block layer stuff.
prev parent reply other threads:[~2024-10-10 19:22 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-24 12:56 [Bug Report][RFC PATCH 0/1] block: fix failing assert on paused VM migration Andrey Drobyshev
2024-09-24 12:56 ` [Bug Report][RFC PATCH 1/1] " Andrey Drobyshev
2024-09-30 9:25 ` Vladimir Sementsov-Ogievskiy
2024-09-30 14:07 ` Andrey Drobyshev
2024-10-01 9:54 ` Vladimir Sementsov-Ogievskiy
2024-10-09 20:53 ` Fabiano Rosas
2024-10-10 6:48 ` Vladimir Sementsov-Ogievskiy
2024-10-10 14:30 ` Fabiano Rosas
2024-10-10 15:42 ` Vladimir Sementsov-Ogievskiy
2024-10-10 19:21 ` Fabiano Rosas [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87a5fbsr01.fsf@suse.de \
--to=farosas@suse.de \
--cc=andrey.drobyshev@virtuozzo.com \
--cc=den@virtuozzo.com \
--cc=eesposit@redhat.com \
--cc=hreitz@redhat.com \
--cc=kwolf@redhat.com \
--cc=peterx@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=vsementsov@yandex-team.ru \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).