public inbox for qemu-devel@nongnu.org
 help / color / mirror / Atom feed
* [PATCH v2] vmstate: fix subsection load name check
@ 2026-03-02  7:06 Alexandr Moshkov
  2026-03-02 17:36 ` Peter Xu
  2026-03-06 13:46 ` Fabiano Rosas
  0 siblings, 2 replies; 15+ messages in thread
From: Alexandr Moshkov @ 2026-03-02  7:06 UTC (permalink / raw)
  To: qemu-devel
  Cc: yc-core@yandex-team.ru, Peter Xu, Fabiano Rosas, Alexandr Moshkov

When loading a subset, its name is checked for the parent prefix. The
following bug may occur here:

Let's say there is a vmstate named "virtio-blk", it has a subsection
named "virtio-blk/subsection", and it also has another vmstate named
"virtio" in the fields.
Then, during the migration, when trying to load this subsection for
"virtio", the prefix condition will pass for "virtio-blk/subsection" and
then the migration will break, because this vmstate does not have such a
subsection.

In other words, if a field inside vmstate1 is set via vmstate2 with a
name that is a prefix of the parent vmstate, then the field can "steal"
a subsection belonging to the parent state.

Looks like it happens because migration stream for "virtio-blk" looks
like this:

 [virtio-blk header] [virtio-blk fields] [virtio-blk subsections]

"virtio-blk" contains "virtio" field, so migration stream is:

 [virtio-blk header] [virtio header] [virtio fields] [virtio
subsections] [virtio-blk subsections]

And when we load the subsections of the "virtio" device,
vmstate_subsection_load() uses qemu_peek_byte() to try to figure out if
this is his subsection. This is where we encounter an error.

Thus, the error occurs due to the fact that vmsd does not know how many
subsections it has when loading (this does not appear anywhere in the
migration stream), so it tries to load all the appropriate ones by
names.

Fix it by checking `/` at the end of idstr.

Signed-off-by: Alexandr Moshkov <dtalexundeer@yandex-team.ru>
---
 migration/vmstate.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/migration/vmstate.c b/migration/vmstate.c
index 4d28364f7b..187f3861f2 100644
--- a/migration/vmstate.c
+++ b/migration/vmstate.c
@@ -613,7 +613,7 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
 
     while (qemu_peek_byte(f, 0) == QEMU_VM_SUBSECTION) {
         char idstr[256], *idstr_ret;
-        int ret;
+        int ret, vmsd_name_len;
         uint8_t version_id, len, size;
         const VMStateDescription *sub_vmsd;
 
@@ -631,7 +631,10 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
         memcpy(idstr, idstr_ret, size);
         idstr[size] = 0;
 
-        if (strncmp(vmsd->name, idstr, strlen(vmsd->name)) != 0) {
+        vmsd_name_len = strlen(vmsd->name);
+        if (strncmp(vmsd->name, idstr, vmsd_name_len) != 0 ||
+            /* to avoid taking parent subsection here */
+            idstr[vmsd_name_len] != '/') {
             trace_vmstate_subsection_load_bad(vmsd->name, idstr, "(prefix)");
             /* it doesn't have a valid subsection name */
             return 0;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] vmstate: fix subsection load name check
  2026-03-02  7:06 [PATCH v2] vmstate: fix subsection load name check Alexandr Moshkov
@ 2026-03-02 17:36 ` Peter Xu
  2026-03-06 13:46 ` Fabiano Rosas
  1 sibling, 0 replies; 15+ messages in thread
From: Peter Xu @ 2026-03-02 17:36 UTC (permalink / raw)
  To: Alexandr Moshkov; +Cc: qemu-devel, yc-core@yandex-team.ru, Fabiano Rosas

On Mon, Mar 02, 2026 at 12:06:26PM +0500, Alexandr Moshkov wrote:
> When loading a subset, its name is checked for the parent prefix. The
> following bug may occur here:
> 
> Let's say there is a vmstate named "virtio-blk", it has a subsection
> named "virtio-blk/subsection", and it also has another vmstate named
> "virtio" in the fields.
> Then, during the migration, when trying to load this subsection for
> "virtio", the prefix condition will pass for "virtio-blk/subsection" and
> then the migration will break, because this vmstate does not have such a
> subsection.
> 
> In other words, if a field inside vmstate1 is set via vmstate2 with a
> name that is a prefix of the parent vmstate, then the field can "steal"
> a subsection belonging to the parent state.
> 
> Looks like it happens because migration stream for "virtio-blk" looks
> like this:
> 
>  [virtio-blk header] [virtio-blk fields] [virtio-blk subsections]
> 
> "virtio-blk" contains "virtio" field, so migration stream is:
> 
>  [virtio-blk header] [virtio header] [virtio fields] [virtio
> subsections] [virtio-blk subsections]
> 
> And when we load the subsections of the "virtio" device,
> vmstate_subsection_load() uses qemu_peek_byte() to try to figure out if
> this is his subsection. This is where we encounter an error.
> 
> Thus, the error occurs due to the fact that vmsd does not know how many
> subsections it has when loading (this does not appear anywhere in the
> migration stream), so it tries to load all the appropriate ones by
> names.
> 
> Fix it by checking `/` at the end of idstr.
> 
> Signed-off-by: Alexandr Moshkov <dtalexundeer@yandex-team.ru>

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] vmstate: fix subsection load name check
  2026-03-02  7:06 [PATCH v2] vmstate: fix subsection load name check Alexandr Moshkov
  2026-03-02 17:36 ` Peter Xu
@ 2026-03-06 13:46 ` Fabiano Rosas
  2026-03-10  5:32   ` Alexandr Moshkov
  1 sibling, 1 reply; 15+ messages in thread
From: Fabiano Rosas @ 2026-03-06 13:46 UTC (permalink / raw)
  To: Alexandr Moshkov, qemu-devel
  Cc: yc-core@yandex-team.ru, Peter Xu, Alexandr Moshkov

Alexandr Moshkov <dtalexundeer@yandex-team.ru> writes:

> When loading a subset, its name is checked for the parent prefix. The
> following bug may occur here:
>
> Let's say there is a vmstate named "virtio-blk", it has a subsection
> named "virtio-blk/subsection", and it also has another vmstate named
> "virtio" in the fields.
> Then, during the migration, when trying to load this subsection for
> "virtio", the prefix condition will pass for "virtio-blk/subsection" and
> then the migration will break, because this vmstate does not have such a
> subsection.
>
> In other words, if a field inside vmstate1 is set via vmstate2 with a
> name that is a prefix of the parent vmstate, then the field can "steal"
> a subsection belonging to the parent state.
>
> Looks like it happens because migration stream for "virtio-blk" looks
> like this:
>
>  [virtio-blk header] [virtio-blk fields] [virtio-blk subsections]
>
> "virtio-blk" contains "virtio" field, so migration stream is:
>
>  [virtio-blk header] [virtio header] [virtio fields] [virtio
> subsections] [virtio-blk subsections]
>
> And when we load the subsections of the "virtio" device,
> vmstate_subsection_load() uses qemu_peek_byte() to try to figure out if
> this is his subsection. This is where we encounter an error.
>
> Thus, the error occurs due to the fact that vmsd does not know how many
> subsections it has when loading (this does not appear anywhere in the
> migration stream), so it tries to load all the appropriate ones by
> names.
>
> Fix it by checking `/` at the end of idstr.
>
> Signed-off-by: Alexandr Moshkov <dtalexundeer@yandex-team.ru>
> ---
>  migration/vmstate.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/migration/vmstate.c b/migration/vmstate.c
> index 4d28364f7b..187f3861f2 100644
> --- a/migration/vmstate.c
> +++ b/migration/vmstate.c
> @@ -613,7 +613,7 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
>  
>      while (qemu_peek_byte(f, 0) == QEMU_VM_SUBSECTION) {
>          char idstr[256], *idstr_ret;
> -        int ret;
> +        int ret, vmsd_name_len;
>          uint8_t version_id, len, size;
>          const VMStateDescription *sub_vmsd;
>  
> @@ -631,7 +631,10 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
>          memcpy(idstr, idstr_ret, size);
>          idstr[size] = 0;
>  
> -        if (strncmp(vmsd->name, idstr, strlen(vmsd->name)) != 0) {
> +        vmsd_name_len = strlen(vmsd->name);
> +        if (strncmp(vmsd->name, idstr, vmsd_name_len) != 0 ||
> +            /* to avoid taking parent subsection here */
> +            idstr[vmsd_name_len] != '/') {
>              trace_vmstate_subsection_load_bad(vmsd->name, idstr, "(prefix)");
>              /* it doesn't have a valid subsection name */
>              return 0;

Hi, this breaks migration for s390x and ppc64:

qemu-system-s390x: Missing section footer for qemu-s390-flic

qemu-system-ppc64: Missing section footer for spapr_iommu


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] vmstate: fix subsection load name check
  2026-03-06 13:46 ` Fabiano Rosas
@ 2026-03-10  5:32   ` Alexandr Moshkov
  2026-03-10 13:27     ` Alexandr Moshkov
  0 siblings, 1 reply; 15+ messages in thread
From: Alexandr Moshkov @ 2026-03-10  5:32 UTC (permalink / raw)
  To: Fabiano Rosas, qemu-devel; +Cc: yc-core@yandex-team.ru, Peter Xu

[-- Attachment #1: Type: text/plain, Size: 3601 bytes --]


On 3/6/26 18:46, Fabiano Rosas wrote:
> Alexandr Moshkov<dtalexundeer@yandex-team.ru> writes:
>
>> When loading a subset, its name is checked for the parent prefix. The
>> following bug may occur here:
>>
>> Let's say there is a vmstate named "virtio-blk", it has a subsection
>> named "virtio-blk/subsection", and it also has another vmstate named
>> "virtio" in the fields.
>> Then, during the migration, when trying to load this subsection for
>> "virtio", the prefix condition will pass for "virtio-blk/subsection" and
>> then the migration will break, because this vmstate does not have such a
>> subsection.
>>
>> In other words, if a field inside vmstate1 is set via vmstate2 with a
>> name that is a prefix of the parent vmstate, then the field can "steal"
>> a subsection belonging to the parent state.
>>
>> Looks like it happens because migration stream for "virtio-blk" looks
>> like this:
>>
>>   [virtio-blk header] [virtio-blk fields] [virtio-blk subsections]
>>
>> "virtio-blk" contains "virtio" field, so migration stream is:
>>
>>   [virtio-blk header] [virtio header] [virtio fields] [virtio
>> subsections] [virtio-blk subsections]
>>
>> And when we load the subsections of the "virtio" device,
>> vmstate_subsection_load() uses qemu_peek_byte() to try to figure out if
>> this is his subsection. This is where we encounter an error.
>>
>> Thus, the error occurs due to the fact that vmsd does not know how many
>> subsections it has when loading (this does not appear anywhere in the
>> migration stream), so it tries to load all the appropriate ones by
>> names.
>>
>> Fix it by checking `/` at the end of idstr.
>>
>> Signed-off-by: Alexandr Moshkov<dtalexundeer@yandex-team.ru>
>> ---
>>   migration/vmstate.c | 7 +++++--
>>   1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/migration/vmstate.c b/migration/vmstate.c
>> index 4d28364f7b..187f3861f2 100644
>> --- a/migration/vmstate.c
>> +++ b/migration/vmstate.c
>> @@ -613,7 +613,7 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
>>   
>>       while (qemu_peek_byte(f, 0) == QEMU_VM_SUBSECTION) {
>>           char idstr[256], *idstr_ret;
>> -        int ret;
>> +        int ret, vmsd_name_len;
>>           uint8_t version_id, len, size;
>>           const VMStateDescription *sub_vmsd;
>>   
>> @@ -631,7 +631,10 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
>>           memcpy(idstr, idstr_ret, size);
>>           idstr[size] = 0;
>>   
>> -        if (strncmp(vmsd->name, idstr, strlen(vmsd->name)) != 0) {
>> +        vmsd_name_len = strlen(vmsd->name);
>> +        if (strncmp(vmsd->name, idstr, vmsd_name_len) != 0 ||
>> +            /* to avoid taking parent subsection here */
>> +            idstr[vmsd_name_len] != '/') {
>>               trace_vmstate_subsection_load_bad(vmsd->name, idstr, "(prefix)");
>>               /* it doesn't have a valid subsection name */
>>               return 0;
> Hi, this breaks migration for s390x and ppc64:
>
> qemu-system-s390x: Missing section footer for qemu-s390-flic

Thanks for reply! It happens because "qemu-s390-flic" vmsd has 
"qemu-s390-flic-full" subsection. I'm not sure if we can change the 
names of the existing subsections. Peter, what do you think?

It seemsto be a biggerproblemthatthe 
migrationframeworkdoesnotdocumentthe namesforsubsections in any 
way.Atthe same time, the codeimpliesthat atleastthe namesshould bea 
substringwithitsparent. There is even such a code comment `subsection 
name has to be "section_name/a"`, that seemsto implythe presenceof /.

[-- Attachment #2: Type: text/html, Size: 8620 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] vmstate: fix subsection load name check
  2026-03-10  5:32   ` Alexandr Moshkov
@ 2026-03-10 13:27     ` Alexandr Moshkov
  2026-03-10 13:39       ` Peter Maydell
  0 siblings, 1 reply; 15+ messages in thread
From: Alexandr Moshkov @ 2026-03-10 13:27 UTC (permalink / raw)
  To: Peter Xu, qemu-devel; +Cc: yc-core@yandex-team.ru, Fabiano Rosas

[-- Attachment #1: Type: text/plain, Size: 4081 bytes --]


On 3/10/26 10:32, Alexandr Moshkov wrote:
>
>
> On 3/6/26 18:46, Fabiano Rosas wrote:
>> Alexandr Moshkov<dtalexundeer@yandex-team.ru> writes:
>>
>>> When loading a subset, its name is checked for the parent prefix. The
>>> following bug may occur here:
>>>
>>> Let's say there is a vmstate named "virtio-blk", it has a subsection
>>> named "virtio-blk/subsection", and it also has another vmstate named
>>> "virtio" in the fields.
>>> Then, during the migration, when trying to load this subsection for
>>> "virtio", the prefix condition will pass for "virtio-blk/subsection" and
>>> then the migration will break, because this vmstate does not have such a
>>> subsection.
>>>
>>> In other words, if a field inside vmstate1 is set via vmstate2 with a
>>> name that is a prefix of the parent vmstate, then the field can "steal"
>>> a subsection belonging to the parent state.
>>>
>>> Looks like it happens because migration stream for "virtio-blk" looks
>>> like this:
>>>
>>>   [virtio-blk header] [virtio-blk fields] [virtio-blk subsections]
>>>
>>> "virtio-blk" contains "virtio" field, so migration stream is:
>>>
>>>   [virtio-blk header] [virtio header] [virtio fields] [virtio
>>> subsections] [virtio-blk subsections]
>>>
>>> And when we load the subsections of the "virtio" device,
>>> vmstate_subsection_load() uses qemu_peek_byte() to try to figure out if
>>> this is his subsection. This is where we encounter an error.
>>>
>>> Thus, the error occurs due to the fact that vmsd does not know how many
>>> subsections it has when loading (this does not appear anywhere in the
>>> migration stream), so it tries to load all the appropriate ones by
>>> names.
>>>
>>> Fix it by checking `/` at the end of idstr.
>>>
>>> Signed-off-by: Alexandr Moshkov<dtalexundeer@yandex-team.ru>
>>> ---
>>>   migration/vmstate.c | 7 +++++--
>>>   1 file changed, 5 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/migration/vmstate.c b/migration/vmstate.c
>>> index 4d28364f7b..187f3861f2 100644
>>> --- a/migration/vmstate.c
>>> +++ b/migration/vmstate.c
>>> @@ -613,7 +613,7 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
>>>   
>>>       while (qemu_peek_byte(f, 0) == QEMU_VM_SUBSECTION) {
>>>           char idstr[256], *idstr_ret;
>>> -        int ret;
>>> +        int ret, vmsd_name_len;
>>>           uint8_t version_id, len, size;
>>>           const VMStateDescription *sub_vmsd;
>>>   
>>> @@ -631,7 +631,10 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
>>>           memcpy(idstr, idstr_ret, size);
>>>           idstr[size] = 0;
>>>   
>>> -        if (strncmp(vmsd->name, idstr, strlen(vmsd->name)) != 0) {
>>> +        vmsd_name_len = strlen(vmsd->name);
>>> +        if (strncmp(vmsd->name, idstr, vmsd_name_len) != 0 ||
>>> +            /* to avoid taking parent subsection here */
>>> +            idstr[vmsd_name_len] != '/') {
>>>               trace_vmstate_subsection_load_bad(vmsd->name, idstr, "(prefix)");
>>>               /* it doesn't have a valid subsection name */
>>>               return 0;
>> Hi, this breaks migration for s390x and ppc64:
>>
>> qemu-system-s390x: Missing section footer for qemu-s390-flic
>
> Thanks for reply! It happens because "qemu-s390-flic" vmsd has 
> "qemu-s390-flic-full" subsection. I'm not sure if we can change the 
> names of the existing subsections. Peter, what do you think?
>
> It seemsto be a biggerproblemthatthe 
> migrationframeworkdoesnotdocumentthe namesforsubsections in any 
> way.Atthe same time, the codeimpliesthat atleastthe namesshould bea 
> substringwithitsparent. There is even such a code comment `subsection 
> name has to be "section_name/a"`, that seemsto implythe presenceof /.
>
Also I tried to put together a list of devices whose separator in 
subsections is not /:

pflash_cfi01
or-irq
vga
ide bmdma (this one is interesting because using both `_` and `/` as 
separator
pckbd
apic
arm_gic
qemu-s390-flic
iotkit_secctl
spapr_iommu
spapr
arm.cortex-a9-global-timer
vhost-user-fs

They using `_`, `-` or `.` as separator

[-- Attachment #2: Type: text/html, Size: 9475 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] vmstate: fix subsection load name check
  2026-03-10 13:27     ` Alexandr Moshkov
@ 2026-03-10 13:39       ` Peter Maydell
  2026-03-10 18:00         ` Peter Xu
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Maydell @ 2026-03-10 13:39 UTC (permalink / raw)
  To: Alexandr Moshkov
  Cc: Peter Xu, qemu-devel, yc-core@yandex-team.ru, Fabiano Rosas

On Tue, 10 Mar 2026 at 13:28, Alexandr Moshkov
<dtalexundeer@yandex-team.ru> wrote:
> Hi, this breaks migration for s390x and ppc64:
>
> qemu-system-s390x: Missing section footer for qemu-s390-flic
>
> Thanks for reply! It happens because "qemu-s390-flic" vmsd has "qemu-s390-flic-full" subsection. I'm not sure if we can change the names of the existing subsections. Peter, what do you think?
>
> It seems to be a bigger problem that the migration framework does not document the names for subsections in any way. At the same time, the code implies that at least the names should be a substring with its parent. There is even such a code comment `subsection name has to be "section_name/a"`, that seems to imply the presence of /.
>
> Also I tried to put together a list of devices whose separator in subsections is not /:

> or-irq

This one and probably some of the others are my mistake, I think.
I think this is because:
 * the migration code imposes a constraint on the subsection names
 * the migration documentation does not mention this constraint
 * the migration code does not effectively enforce this constraint
   (e.g. by asserting when the vmstate with a bad name is registered)

My assumption when writing that code was very likely that the
name of the subsection didn't have to have any relation to
the name of its parent subsection (after all, the migration
code knows it is a subsection, so if it needed to have the
name on the wire be "parentname/subsectionname" it could construct
that itself), and since it all just worked I never noticed the
mistake...

-- PMM


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] vmstate: fix subsection load name check
  2026-03-10 13:39       ` Peter Maydell
@ 2026-03-10 18:00         ` Peter Xu
  2026-03-11  6:53           ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Xu @ 2026-03-10 18:00 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Alexandr Moshkov, qemu-devel, yc-core@yandex-team.ru,
	Fabiano Rosas

On Tue, Mar 10, 2026 at 01:39:04PM +0000, Peter Maydell wrote:
> On Tue, 10 Mar 2026 at 13:28, Alexandr Moshkov
> <dtalexundeer@yandex-team.ru> wrote:
> > Hi, this breaks migration for s390x and ppc64:
> >
> > qemu-system-s390x: Missing section footer for qemu-s390-flic
> >
> > Thanks for reply! It happens because "qemu-s390-flic" vmsd has "qemu-s390-flic-full" subsection. I'm not sure if we can change the names of the existing subsections. Peter, what do you think?
> >
> > It seems to be a bigger problem that the migration framework does not document the names for subsections in any way. At the same time, the code implies that at least the names should be a substring with its parent. There is even such a code comment `subsection name has to be "section_name/a"`, that seems to imply the presence of /.
> >
> > Also I tried to put together a list of devices whose separator in subsections is not /:
> 
> > or-irq
> 
> This one and probably some of the others are my mistake, I think.
> I think this is because:
>  * the migration code imposes a constraint on the subsection names
>  * the migration documentation does not mention this constraint
>  * the migration code does not effectively enforce this constraint
>    (e.g. by asserting when the vmstate with a bad name is registered)
> 
> My assumption when writing that code was very likely that the
> name of the subsection didn't have to have any relation to
> the name of its parent subsection (after all, the migration
> code knows it is a subsection, so if it needed to have the
> name on the wire be "parentname/subsectionname" it could construct
> that itself), and since it all just worked I never noticed the
> mistake...

Indeed, we can do better on the 2nd/3rd bullets.. so we should document it
in struct VMStateDescription definition, and enforce it when registering
new VMSDs.

One thing to double check with Alexandr on this one:

> They using `_`, `-` or `.` as separator

Does it mean we also can't simply whitelist all these characters (including
"/"), because it won't always work?

For example in your case, it's "virtio-blk" being the parent VMSD, "virtio"
being the child VMSD, followed with a subsection belongs to virtio-blk.
We want to make that subsection be recognizable as virtio-blk's.

Then if we treat "-" also to be a separator, then "virtio" will still
mis-recognize virtio-blk's as its own (because its name is "virtio-blk/..."
hence it also satisfies the "virtio" check)?

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] vmstate: fix subsection load name check
  2026-03-10 18:00         ` Peter Xu
@ 2026-03-11  6:53           ` Vladimir Sementsov-Ogievskiy
  2026-03-11 15:05             ` Peter Xu
  0 siblings, 1 reply; 15+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2026-03-11  6:53 UTC (permalink / raw)
  To: Peter Xu, Peter Maydell
  Cc: Alexandr Moshkov, qemu-devel, yc-core@yandex-team.ru,
	Fabiano Rosas

On 10.03.26 21:00, Peter Xu wrote:
> On Tue, Mar 10, 2026 at 01:39:04PM +0000, Peter Maydell wrote:
>> On Tue, 10 Mar 2026 at 13:28, Alexandr Moshkov
>> <dtalexundeer@yandex-team.ru> wrote:
>>> Hi, this breaks migration for s390x and ppc64:
>>>
>>> qemu-system-s390x: Missing section footer for qemu-s390-flic
>>>
>>> Thanks for reply! It happens because "qemu-s390-flic" vmsd has "qemu-s390-flic-full" subsection. I'm not sure if we can change the names of the existing subsections. Peter, what do you think?
>>>
>>> It seems to be a bigger problem that the migration framework does not document the names for subsections in any way. At the same time, the code implies that at least the names should be a substring with its parent. There is even such a code comment `subsection name has to be "section_name/a"`, that seems to imply the presence of /.
>>>
>>> Also I tried to put together a list of devices whose separator in subsections is not /:
>>
>>> or-irq
>>
>> This one and probably some of the others are my mistake, I think.
>> I think this is because:
>>   * the migration code imposes a constraint on the subsection names
>>   * the migration documentation does not mention this constraint
>>   * the migration code does not effectively enforce this constraint
>>     (e.g. by asserting when the vmstate with a bad name is registered)
>>
>> My assumption when writing that code was very likely that the
>> name of the subsection didn't have to have any relation to
>> the name of its parent subsection (after all, the migration
>> code knows it is a subsection, so if it needed to have the
>> name on the wire be "parentname/subsectionname" it could construct
>> that itself), and since it all just worked I never noticed the
>> mistake...
> 
> Indeed, we can do better on the 2nd/3rd bullets.. so we should document it
> in struct VMStateDescription definition, and enforce it when registering
> new VMSDs.
> 
> One thing to double check with Alexandr on this one:
> 
>> They using `_`, `-` or `.` as separator
> 
> Does it mean we also can't simply whitelist all these characters (including
> "/"), because it won't always work?
> 
> For example in your case, it's "virtio-blk" being the parent VMSD, "virtio"
> being the child VMSD, followed with a subsection belongs to virtio-blk.
> We want to make that subsection be recognizable as virtio-blk's.
> 
> Then if we treat "-" also to be a separator, then "virtio" will still
> mis-recognize virtio-blk's as its own (because its name is "virtio-blk/..."
> hence it also satisfies the "virtio" check)?
> 

I have an idea. What if we simply change the order, instead of

1. check prefix (with '/' at the end). and stop the loop if it doesn't match
2. call vmstate_get_subsection(vmsd->subsections, idstr)

do

0. Document, and check, that starting from 11.0 any new subsections must follow <section name>/<subsection name> notation

in vmstate_subsection_load():

1. call vmstate_get_subsection, if it succeeded we are done
2. check prefix with '/' at the end. If match - it's a new subsection, added after 11.0 point, in correct notation. If does not match - it's subsection of another state, stop the loop.


Possible problems:

1. may be some subsections (with wrong notation), which were added before 11.0, but then removed (we break migration from old version to new).

This can be checked by git-grepping for subsections through qemu releases.

2. may be some downstream subsections (we may break migration from some downstream versions to upstream).

Should we care of it? Probably not.

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] vmstate: fix subsection load name check
  2026-03-11  6:53           ` Vladimir Sementsov-Ogievskiy
@ 2026-03-11 15:05             ` Peter Xu
  2026-03-12  6:30               ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Xu @ 2026-03-11 15:05 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: Peter Maydell, Alexandr Moshkov, qemu-devel,
	yc-core@yandex-team.ru, Fabiano Rosas

On Wed, Mar 11, 2026 at 09:53:07AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 10.03.26 21:00, Peter Xu wrote:
> > On Tue, Mar 10, 2026 at 01:39:04PM +0000, Peter Maydell wrote:
> > > On Tue, 10 Mar 2026 at 13:28, Alexandr Moshkov
> > > <dtalexundeer@yandex-team.ru> wrote:
> > > > Hi, this breaks migration for s390x and ppc64:
> > > > 
> > > > qemu-system-s390x: Missing section footer for qemu-s390-flic
> > > > 
> > > > Thanks for reply! It happens because "qemu-s390-flic" vmsd has "qemu-s390-flic-full" subsection. I'm not sure if we can change the names of the existing subsections. Peter, what do you think?
> > > > 
> > > > It seems to be a bigger problem that the migration framework does not document the names for subsections in any way. At the same time, the code implies that at least the names should be a substring with its parent. There is even such a code comment `subsection name has to be "section_name/a"`, that seems to imply the presence of /.
> > > > 
> > > > Also I tried to put together a list of devices whose separator in subsections is not /:
> > > 
> > > > or-irq
> > > 
> > > This one and probably some of the others are my mistake, I think.
> > > I think this is because:
> > >   * the migration code imposes a constraint on the subsection names
> > >   * the migration documentation does not mention this constraint
> > >   * the migration code does not effectively enforce this constraint
> > >     (e.g. by asserting when the vmstate with a bad name is registered)
> > > 
> > > My assumption when writing that code was very likely that the
> > > name of the subsection didn't have to have any relation to
> > > the name of its parent subsection (after all, the migration
> > > code knows it is a subsection, so if it needed to have the
> > > name on the wire be "parentname/subsectionname" it could construct
> > > that itself), and since it all just worked I never noticed the
> > > mistake...
> > 
> > Indeed, we can do better on the 2nd/3rd bullets.. so we should document it
> > in struct VMStateDescription definition, and enforce it when registering
> > new VMSDs.
> > 
> > One thing to double check with Alexandr on this one:
> > 
> > > They using `_`, `-` or `.` as separator
> > 
> > Does it mean we also can't simply whitelist all these characters (including
> > "/"), because it won't always work?
> > 
> > For example in your case, it's "virtio-blk" being the parent VMSD, "virtio"
> > being the child VMSD, followed with a subsection belongs to virtio-blk.
> > We want to make that subsection be recognizable as virtio-blk's.
> > 
> > Then if we treat "-" also to be a separator, then "virtio" will still
> > mis-recognize virtio-blk's as its own (because its name is "virtio-blk/..."
> > hence it also satisfies the "virtio" check)?
> > 
> 
> I have an idea. What if we simply change the order, instead of
> 
> 1. check prefix (with '/' at the end). and stop the loop if it doesn't match
> 2. call vmstate_get_subsection(vmsd->subsections, idstr)
> 
> do
> 
> 0. Document, and check, that starting from 11.0 any new subsections must follow <section name>/<subsection name> notation
> 
> in vmstate_subsection_load():
> 
> 1. call vmstate_get_subsection, if it succeeded we are done
> 2. check prefix with '/' at the end. If match - it's a new subsection, added after 11.0 point, in correct notation. If does not match - it's subsection of another state, stop the loop.

Yep, this sounds working.

But I wonder do we need (2) at all?  Say, can we sololy rely on
vmstate_get_subsection() to identify if we should take it or just leave it
for upper layers?  So we will never fail a subsection lookup, only either
(1) load it if owned, or (2) ignore it and pop the VMSD stack.

It means in code we don't restrict subsection name to be "$PARENT_NAME/*".

However we should likely still suggest that in doc of subsections, so we at
least avoid one parent VMSD having subsection S1, then its field having
subsection S2, and by accident S1 and S2 has the same name..

> 
> 
> Possible problems:
> 
> 1. may be some subsections (with wrong notation), which were added before 11.0, but then removed (we break migration from old version to new).

If a subsection with wrong notation got removed, then it should require a
machine compat property otherwise migration should be broken anyway?

> 
> This can be checked by git-grepping for subsections through qemu releases.
> 
> 2. may be some downstream subsections (we may break migration from some downstream versions to upstream).
> 
> Should we care of it? Probably not.

Not sure if this is an issue if we drop (2) directly above.

There seems to have a 3rd possible "issue": the failure will be more
obscure on a subsection that nobody recognizes; we used to try our best say
"VM subsection '...' in '...' does not exist", but then it'll always pop
the VMSD stack then the upper layer may read the SUBSECTION byte anywhere.
So debugging a stream with an extra subsection can be slightly more
awkward, but I'm not sure how much.  Meanwhile for a compatible / normal
migration it'll be all fine.  It just means when seeing a subsection will
try to pick it up from the inner VMSD and fallback to outter VMSD every
time.

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] vmstate: fix subsection load name check
  2026-03-11 15:05             ` Peter Xu
@ 2026-03-12  6:30               ` Vladimir Sementsov-Ogievskiy
  2026-03-12 15:34                 ` Peter Xu
  0 siblings, 1 reply; 15+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2026-03-12  6:30 UTC (permalink / raw)
  To: Peter Xu
  Cc: Peter Maydell, Alexandr Moshkov, qemu-devel,
	yc-core@yandex-team.ru, Fabiano Rosas

On 11.03.26 18:05, Peter Xu wrote:
> On Wed, Mar 11, 2026 at 09:53:07AM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> On 10.03.26 21:00, Peter Xu wrote:
>>> On Tue, Mar 10, 2026 at 01:39:04PM +0000, Peter Maydell wrote:
>>>> On Tue, 10 Mar 2026 at 13:28, Alexandr Moshkov
>>>> <dtalexundeer@yandex-team.ru> wrote:
>>>>> Hi, this breaks migration for s390x and ppc64:
>>>>>
>>>>> qemu-system-s390x: Missing section footer for qemu-s390-flic
>>>>>
>>>>> Thanks for reply! It happens because "qemu-s390-flic" vmsd has "qemu-s390-flic-full" subsection. I'm not sure if we can change the names of the existing subsections. Peter, what do you think?
>>>>>
>>>>> It seems to be a bigger problem that the migration framework does not document the names for subsections in any way. At the same time, the code implies that at least the names should be a substring with its parent. There is even such a code comment `subsection name has to be "section_name/a"`, that seems to imply the presence of /.
>>>>>
>>>>> Also I tried to put together a list of devices whose separator in subsections is not /:
>>>>
>>>>> or-irq
>>>>
>>>> This one and probably some of the others are my mistake, I think.
>>>> I think this is because:
>>>>    * the migration code imposes a constraint on the subsection names
>>>>    * the migration documentation does not mention this constraint
>>>>    * the migration code does not effectively enforce this constraint
>>>>      (e.g. by asserting when the vmstate with a bad name is registered)
>>>>
>>>> My assumption when writing that code was very likely that the
>>>> name of the subsection didn't have to have any relation to
>>>> the name of its parent subsection (after all, the migration
>>>> code knows it is a subsection, so if it needed to have the
>>>> name on the wire be "parentname/subsectionname" it could construct
>>>> that itself), and since it all just worked I never noticed the
>>>> mistake...
>>>
>>> Indeed, we can do better on the 2nd/3rd bullets.. so we should document it
>>> in struct VMStateDescription definition, and enforce it when registering
>>> new VMSDs.
>>>
>>> One thing to double check with Alexandr on this one:
>>>
>>>> They using `_`, `-` or `.` as separator
>>>
>>> Does it mean we also can't simply whitelist all these characters (including
>>> "/"), because it won't always work?
>>>
>>> For example in your case, it's "virtio-blk" being the parent VMSD, "virtio"
>>> being the child VMSD, followed with a subsection belongs to virtio-blk.
>>> We want to make that subsection be recognizable as virtio-blk's.
>>>
>>> Then if we treat "-" also to be a separator, then "virtio" will still
>>> mis-recognize virtio-blk's as its own (because its name is "virtio-blk/..."
>>> hence it also satisfies the "virtio" check)?
>>>
>>
>> I have an idea. What if we simply change the order, instead of
>>
>> 1. check prefix (with '/' at the end). and stop the loop if it doesn't match
>> 2. call vmstate_get_subsection(vmsd->subsections, idstr)
>>
>> do
>>
>> 0. Document, and check, that starting from 11.0 any new subsections must follow <section name>/<subsection name> notation
>>
>> in vmstate_subsection_load():
>>
>> 1. call vmstate_get_subsection, if it succeeded we are done
>> 2. check prefix with '/' at the end. If match - it's a new subsection, added after 11.0 point, in correct notation. If does not match - it's subsection of another state, stop the loop.
> 
> Yep, this sounds working.
> 
> But I wonder do we need (2) at all?  Say, can we sololy rely on
> vmstate_get_subsection() to identify if we should take it or just leave it
> for upper layers?  So we will never fail a subsection lookup, only either
> (1) load it if owned, or (2) ignore it and pop the VMSD stack.
> 
> It means in code we don't restrict subsection name to be "$PARENT_NAME/*".
> 
> However we should likely still suggest that in doc of subsections, so we at
> least avoid one parent VMSD having subsection S1, then its field having
> subsection S2, and by accident S1 and S2 has the same name..
> 
>>
>>
>> Possible problems:
>>
>> 1. may be some subsections (with wrong notation), which were added before 11.0, but then removed (we break migration from old version to new).
> 
> If a subsection with wrong notation got removed, then it should require a
> machine compat property otherwise migration should be broken anyway?
> 
>>
>> This can be checked by git-grepping for subsections through qemu releases.
>>
>> 2. may be some downstream subsections (we may break migration from some downstream versions to upstream).
>>
>> Should we care of it? Probably not.
> 
> Not sure if this is an issue if we drop (2) directly above.
> 
> There seems to have a 3rd possible "issue": the failure will be more
> obscure on a subsection that nobody recognizes; we used to try our best say
> "VM subsection '...' in '...' does not exist", but then it'll always pop
> the VMSD stack then the upper layer may read the SUBSECTION byte anywhere.
> So debugging a stream with an extra subsection can be slightly more
> awkward, but I'm not sure how much.  Meanwhile for a compatible / normal
> migration it'll be all fine.  It just means when seeing a subsection will
> try to pick it up from the inner VMSD and fallback to outter VMSD every
> time.
> 

Understand now. All this time I somehow thought, that we check this prefix to
be able to skip the subsection, if we are sure that it belong to current section
but we don't support it. But that's obviously nonsense, we can't skip, and
the code just return an error if vmstate_get_subsection() fails.

Now I see that prefix check is only to generate more accurate error message.

Yes, I agree that more correct is to drop [2] at all and simply return 0
if vmstate_get_subsection() fails, that will work for any correct migration.

---

You say

> .. always pop the VMSD stack then the upper layer may read the SUBSECTION byte anywhere.

But it actually can be anything. Of course probability is small that some another state
looks like QEMU_VM_SUBSECTION + <existing subsection name>, but we don't have actual
guarantee for it in the protocol.

Probably, we may also add a protocol change (with some global state property,
set to true in new MT) to add a number of subsections into section state, to be
sure how many of them we have in the stream.



-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] vmstate: fix subsection load name check
  2026-03-12  6:30               ` Vladimir Sementsov-Ogievskiy
@ 2026-03-12 15:34                 ` Peter Xu
  2026-03-12 16:34                   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Xu @ 2026-03-12 15:34 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: Peter Maydell, Alexandr Moshkov, qemu-devel,
	yc-core@yandex-team.ru, Fabiano Rosas

On Thu, Mar 12, 2026 at 09:30:35AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> You say
> 
> > .. always pop the VMSD stack then the upper layer may read the SUBSECTION byte anywhere.
> 
> But it actually can be anything. Of course probability is small that some another state
> looks like QEMU_VM_SUBSECTION + <existing subsection name>, but we don't have actual
> guarantee for it in the protocol.
> 
> Probably, we may also add a protocol change (with some global state property,
> set to true in new MT) to add a number of subsections into section state, to be
> sure how many of them we have in the stream.

Yes, I believe it's always possible to change the streaming protocol to
make it even clearer.  Actually, that's what I was picturing before I read
your previous reply.

For doing that, I wonder if we should just fix "this specific problem" or
"the bigger problem".

The bigger problem here, at least to me, is migration streaming lacks level
information - virtio can wrongly treat that piece of info as its own
subsection only because the stream doesn't tell which level it's in..
It's a common issue for migration stream, so I wonder if we need to break
the streaming protocol then whether we should do even further than that.

That is the part where I think what you suggested in the previous email may
be the easy way to go (plus, removing the name check completely).

But let me double check with you on one thing: before any fix, this problem
will happen even during a migration between two latest QEMU that supports
that new subsection of virtio-blk, am I right?

I am curious why this problem is only reported until today, rather than
when the new subsection was developed.  I wonder if I still miss something
that I didn't notice..

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] vmstate: fix subsection load name check
  2026-03-12 15:34                 ` Peter Xu
@ 2026-03-12 16:34                   ` Vladimir Sementsov-Ogievskiy
  2026-03-12 18:13                     ` Peter Xu
  0 siblings, 1 reply; 15+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2026-03-12 16:34 UTC (permalink / raw)
  To: Peter Xu
  Cc: Peter Maydell, Alexandr Moshkov, qemu-devel,
	yc-core@yandex-team.ru, Fabiano Rosas

On 12.03.26 18:34, Peter Xu wrote:
> On Thu, Mar 12, 2026 at 09:30:35AM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> You say
>>
>>> .. always pop the VMSD stack then the upper layer may read the SUBSECTION byte anywhere.
>>
>> But it actually can be anything. Of course probability is small that some another state
>> looks like QEMU_VM_SUBSECTION + <existing subsection name>, but we don't have actual
>> guarantee for it in the protocol.
>>
>> Probably, we may also add a protocol change (with some global state property,
>> set to true in new MT) to add a number of subsections into section state, to be
>> sure how many of them we have in the stream.
> 
> Yes, I believe it's always possible to change the streaming protocol to
> make it even clearer.  Actually, that's what I was picturing before I read
> your previous reply.
> 
> For doing that, I wonder if we should just fix "this specific problem" or
> "the bigger problem".
> 
> The bigger problem here, at least to me, is migration streaming lacks level
> information - virtio can wrongly treat that piece of info as its own
> subsection only because the stream doesn't tell which level it's in..
> It's a common issue for migration stream, so I wonder if we need to break
> the streaming protocol then whether we should do even further than that.
> 
> That is the part where I think what you suggested in the previous email may
> be the easy way to go (plus, removing the name check completely).
> 
> But let me double check with you on one thing: before any fix, this problem
> will happen even during a migration between two latest QEMU that supports
> that new subsection of virtio-blk, am I right?
> 
> I am curious why this problem is only reported until today, rather than
> when the new subsection was developed.  I wonder if I still miss something
> that I didn't notice..
> 

Hmm. Probably, that this is the subsection, that exist only in our downstream.
I don't know, was it mentioned clearly before, sorry if not.
This subsection just not exist in master, and we are not going to upstream it.
So, for upstream it's a "theoretical" bug in a protocol, which may be triggered in
some future moment. And that's why there is not specific case in commit message
but only assumption "Let's say there is a vmstate ...".

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] vmstate: fix subsection load name check
  2026-03-12 16:34                   ` Vladimir Sementsov-Ogievskiy
@ 2026-03-12 18:13                     ` Peter Xu
  2026-03-12 18:27                       ` Peter Maydell
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Xu @ 2026-03-12 18:13 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: Peter Maydell, Alexandr Moshkov, qemu-devel,
	yc-core@yandex-team.ru, Fabiano Rosas

On Thu, Mar 12, 2026 at 07:34:34PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Hmm. Probably, that this is the subsection, that exist only in our downstream.
> I don't know, was it mentioned clearly before, sorry if not.

Ah.. Looks like it was not mentioned in the latest version posted:

https://lore.kernel.org/all/20260312102626.891359-1-dtalexundeer@yandex-team.ru/

Let's always mention this, because it's important piece of info, e.g. to
know upstream has yet no reproducer.

> This subsection just not exist in master, and we are not going to upstream it.
> So, for upstream it's a "theoretical" bug in a protocol, which may be triggered in
> some future moment. And that's why there is not specific case in commit message
> but only assumption "Let's say there is a vmstate ...".

In this case, for upstream we almost only lose but yet unknown gain with
this fix then, because we stop throwing useful info for unknown new
subsections, as PeterM pointed out:

https://lore.kernel.org/all/CAFEAcA_mzXDdz_xjPnZ9Kc4K01Aoss_OH_qEoL3G33KdsnHrLw@mail.gmail.com/

But I agree it is a potential issue, if we cannot justify QEMU can never
hit it.

Sending "how many subsections are there" is one viable option as you said,
but it only solves this small problem, and knowing that number requires src
QEMU looping over the subsections twice so a bit awkward (first time we
need to kickoff ".needed()" hooks to do the counting; not all will be sent).

The other way is to provide level information somehow in the stream.

Say, we could attach START/END markers for each VMSD to be dumped.  That
may be able to help in other cases too, e.g. when we accidentally grow some
VMSD fields breaking migration, then IIUC dest QEMU can provide a more
accurate error message when it knows the bound of current VMSD that is
being loaded.

But I'm also not sure if this is worth it yet to fix an imaginary problem.

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] vmstate: fix subsection load name check
  2026-03-12 18:13                     ` Peter Xu
@ 2026-03-12 18:27                       ` Peter Maydell
  2026-03-12 19:25                         ` Peter Xu
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Maydell @ 2026-03-12 18:27 UTC (permalink / raw)
  To: Peter Xu
  Cc: Vladimir Sementsov-Ogievskiy, Alexandr Moshkov, qemu-devel,
	yc-core@yandex-team.ru, Fabiano Rosas

On Thu, 12 Mar 2026 at 18:13, Peter Xu <peterx@redhat.com> wrote:
>
> On Thu, Mar 12, 2026 at 07:34:34PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > Hmm. Probably, that this is the subsection, that exist only in our downstream.
> > I don't know, was it mentioned clearly before, sorry if not.
>
> Ah.. Looks like it was not mentioned in the latest version posted:
>
> https://lore.kernel.org/all/20260312102626.891359-1-dtalexundeer@yandex-team.ru/
>
> Let's always mention this, because it's important piece of info, e.g. to
> know upstream has yet no reproducer.
>
> > This subsection just not exist in master, and we are not going to upstream it.
> > So, for upstream it's a "theoretical" bug in a protocol, which may be triggered in
> > some future moment. And that's why there is not specific case in commit message
> > but only assumption "Let's say there is a vmstate ...".
>
> In this case, for upstream we almost only lose but yet unknown gain with
> this fix then, because we stop throwing useful info for unknown new
> subsections, as PeterM pointed out:
>
> https://lore.kernel.org/all/CAFEAcA_mzXDdz_xjPnZ9Kc4K01Aoss_OH_qEoL3G33KdsnHrLw@mail.gmail.com/
>
> But I agree it is a potential issue, if we cannot justify QEMU can never
> hit it.
>
> Sending "how many subsections are there" is one viable option as you said,
> but it only solves this small problem, and knowing that number requires src
> QEMU looping over the subsections twice so a bit awkward (first time we
> need to kickoff ".needed()" hooks to do the counting; not all will be sent).
>
> The other way is to provide level information somehow in the stream.
>
> Say, we could attach START/END markers for each VMSD to be dumped.  That
> may be able to help in other cases too, e.g. when we accidentally grow some
> VMSD fields breaking migration, then IIUC dest QEMU can provide a more
> accurate error message when it knows the bound of current VMSD that is
> being loaded.

Could we perhaps do something like:
 * mark up the vmstate sections we have that don't follow the
   "/ is the separator" naming rule with some new struct field
   that says "legacy_bad_subsection_name = true" or whatever
   (or rename them if there are cases where we're OK with a
   migration break, like "iotkit-secctl")
 * have the migration code enforce the "subsection name follows the
   / separator rule" via an assert when the vmstate is registered,
   with a loosening for the ones marked as legacy names
 * have the inbound migration code use the strong check on the
   separator to distinguish "subsection" from "not subsection"
   unless the vmstate it's doing inbound migration for is marked
   as needing legacy name handling
?

I think that would keep migration compatibility, avoid the problem
of wrong subsection names creeping in in future, and in practice
mean we don't in future hit this issue upstream. (With more effort
it might be also possible to tie the legacy names to QEMU versioned
machines so that they can eventually be retired.)

thanks
-- PMM


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] vmstate: fix subsection load name check
  2026-03-12 18:27                       ` Peter Maydell
@ 2026-03-12 19:25                         ` Peter Xu
  0 siblings, 0 replies; 15+ messages in thread
From: Peter Xu @ 2026-03-12 19:25 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Vladimir Sementsov-Ogievskiy, Alexandr Moshkov, qemu-devel,
	yc-core@yandex-team.ru, Fabiano Rosas

On Thu, Mar 12, 2026 at 06:27:05PM +0000, Peter Maydell wrote:
> On Thu, 12 Mar 2026 at 18:13, Peter Xu <peterx@redhat.com> wrote:
> >
> > On Thu, Mar 12, 2026 at 07:34:34PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > Hmm. Probably, that this is the subsection, that exist only in our downstream.
> > > I don't know, was it mentioned clearly before, sorry if not.
> >
> > Ah.. Looks like it was not mentioned in the latest version posted:
> >
> > https://lore.kernel.org/all/20260312102626.891359-1-dtalexundeer@yandex-team.ru/
> >
> > Let's always mention this, because it's important piece of info, e.g. to
> > know upstream has yet no reproducer.
> >
> > > This subsection just not exist in master, and we are not going to upstream it.
> > > So, for upstream it's a "theoretical" bug in a protocol, which may be triggered in
> > > some future moment. And that's why there is not specific case in commit message
> > > but only assumption "Let's say there is a vmstate ...".
> >
> > In this case, for upstream we almost only lose but yet unknown gain with
> > this fix then, because we stop throwing useful info for unknown new
> > subsections, as PeterM pointed out:
> >
> > https://lore.kernel.org/all/CAFEAcA_mzXDdz_xjPnZ9Kc4K01Aoss_OH_qEoL3G33KdsnHrLw@mail.gmail.com/
> >
> > But I agree it is a potential issue, if we cannot justify QEMU can never
> > hit it.
> >
> > Sending "how many subsections are there" is one viable option as you said,
> > but it only solves this small problem, and knowing that number requires src
> > QEMU looping over the subsections twice so a bit awkward (first time we
> > need to kickoff ".needed()" hooks to do the counting; not all will be sent).
> >
> > The other way is to provide level information somehow in the stream.
> >
> > Say, we could attach START/END markers for each VMSD to be dumped.  That
> > may be able to help in other cases too, e.g. when we accidentally grow some
> > VMSD fields breaking migration, then IIUC dest QEMU can provide a more
> > accurate error message when it knows the bound of current VMSD that is
> > being loaded.
> 
> Could we perhaps do something like:
>  * mark up the vmstate sections we have that don't follow the
>    "/ is the separator" naming rule with some new struct field
>    that says "legacy_bad_subsection_name = true" or whatever
>    (or rename them if there are cases where we're OK with a
>    migration break, like "iotkit-secctl")
>  * have the migration code enforce the "subsection name follows the
>    / separator rule" via an assert when the vmstate is registered,
>    with a loosening for the ones marked as legacy names
>  * have the inbound migration code use the strong check on the
>    separator to distinguish "subsection" from "not subsection"
>    unless the vmstate it's doing inbound migration for is marked
>    as needing legacy name handling
> ?
> 
> I think that would keep migration compatibility, avoid the problem
> of wrong subsection names creeping in in future, and in practice
> mean we don't in future hit this issue upstream. (With more effort
> it might be also possible to tie the legacy names to QEMU versioned
> machines so that they can eventually be retired.)

Yes this is an option.  This will cause a few other things to maintain for
us, though..

- legacy_bad_subsection_name itself

- two different paths of parsing subsections

- if we want to get rid of this (i think we should..), we need the machine
  compat property for legacy_bad_subsection_name to get rid of it after 6
  years

We'd better also make sure we don't miss something when choosing the new
legacy_bad_subsection_name candidates, or it'll start to fail migrations
with a more strict check.

That so far just sounds slightly more work and less benefit comparing to
introducing START/END markers for VMSDs to me.  With START/END markers for
each VMSD (or something like it), we can actually drop the requirement on
"sub-vmsd names to contain parents' vmsd names", because we have the level
info, then it's clear which subsection belongs to which parent vmsd.

I wished we have that already; we almost have it (QEMU_VM_SECTION_FULL,
QEMU_VM_SECTION_FOOTER, etc.) we're very close except it's just that it
only works for a whole section hence only top VMSD, rather than every one
of them..

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2026-03-12 19:25 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-02  7:06 [PATCH v2] vmstate: fix subsection load name check Alexandr Moshkov
2026-03-02 17:36 ` Peter Xu
2026-03-06 13:46 ` Fabiano Rosas
2026-03-10  5:32   ` Alexandr Moshkov
2026-03-10 13:27     ` Alexandr Moshkov
2026-03-10 13:39       ` Peter Maydell
2026-03-10 18:00         ` Peter Xu
2026-03-11  6:53           ` Vladimir Sementsov-Ogievskiy
2026-03-11 15:05             ` Peter Xu
2026-03-12  6:30               ` Vladimir Sementsov-Ogievskiy
2026-03-12 15:34                 ` Peter Xu
2026-03-12 16:34                   ` Vladimir Sementsov-Ogievskiy
2026-03-12 18:13                     ` Peter Xu
2026-03-12 18:27                       ` Peter Maydell
2026-03-12 19:25                         ` Peter Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox