* [PATCH 0/4] Fix "virtio-gpu: fix scanout migration post-load" @ 2024-05-07 11:19 marcandre.lureau 2024-05-07 11:19 ` [PATCH 1/4] migration: add "exists" info to load-state-field trace marcandre.lureau ` (5 more replies) 0 siblings, 6 replies; 16+ messages in thread From: marcandre.lureau @ 2024-05-07 11:19 UTC (permalink / raw) To: qemu-devel Cc: Richard Henderson, Sebastian Ott, Fabiano Rosas, Eduardo Habkost, Fiona Ebner, Gerd Hoffmann, Peter Xu, Philippe Mathieu-Daudé, Marcel Apfelbaum, peter.maydell, Michael S. Tsirkin, Yanan Wang, =?unknown-8bit?q?Marc-Andr=C3=A9?= Lureau From: Marc-André Lureau <marcandre.lureau@redhat.com> Hi, The aforementioned patch breaks virtio-gpu device migrations for versions pre-9.0/9.0, both forwards and backwards. Versioning of `VMS_STRUCT` is more complex than it may initially appear, as evidenced in the problematic commit dfcf74fa68c ("virtio-gpu: fix scanout migration post-load"). To resolve this, we need to propagate the `vmstate` `version_id` through the nested structures. Additionally, we should tie specific machine version to a corresponding `version_id` to maintain migration compatibility. `VMS_VSTRUCT` allows specifying the appropriate version of the nested structure to use. Marc-André Lureau (4): migration: add "exists" info to load-state-field trace include/migration: add VMSTATE_VSTRUCT_TEST_VARRAY_UINT32 virtio-gpu: use a VMState variant for the scanout field virtio-gpu: add x-vmstate-version include/hw/virtio/virtio-gpu.h | 1 + include/migration/vmstate.h | 12 ++++++++++++ hw/core/machine.c | 1 + hw/display/virtio-gpu.c | 28 +++++++++++++++++++++------- migration/vmstate.c | 5 +++-- migration/trace-events | 2 +- 6 files changed, 39 insertions(+), 10 deletions(-) -- 2.41.0.28.gd7d8841f67 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 1/4] migration: add "exists" info to load-state-field trace 2024-05-07 11:19 [PATCH 0/4] Fix "virtio-gpu: fix scanout migration post-load" marcandre.lureau @ 2024-05-07 11:19 ` marcandre.lureau 2024-05-07 15:29 ` Peter Xu 2024-05-07 11:19 ` [PATCH 2/4] include/migration: add VMSTATE_VSTRUCT_TEST_VARRAY_UINT32 marcandre.lureau ` (4 subsequent siblings) 5 siblings, 1 reply; 16+ messages in thread From: marcandre.lureau @ 2024-05-07 11:19 UTC (permalink / raw) To: qemu-devel Cc: Richard Henderson, Sebastian Ott, Fabiano Rosas, Eduardo Habkost, Fiona Ebner, Gerd Hoffmann, Peter Xu, Philippe Mathieu-Daudé, Marcel Apfelbaum, peter.maydell, Michael S. Tsirkin, Yanan Wang, Marc-André Lureau From: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> --- migration/vmstate.c | 5 +++-- migration/trace-events | 2 +- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/migration/vmstate.c b/migration/vmstate.c index ef26f26ccd..b51212a75b 100644 --- a/migration/vmstate.c +++ b/migration/vmstate.c @@ -128,8 +128,9 @@ int vmstate_load_state(QEMUFile *f, const VMStateDescription *vmsd, } } while (field->name) { - trace_vmstate_load_state_field(vmsd->name, field->name); - if (vmstate_field_exists(vmsd, field, opaque, version_id)) { + bool exists = vmstate_field_exists(vmsd, field, opaque, version_id); + trace_vmstate_load_state_field(vmsd->name, field->name, exists); + if (exists) { void *first_elem = opaque + field->offset; int i, n_elems = vmstate_n_elems(opaque, field); int size = vmstate_size(opaque, field); diff --git a/migration/trace-events b/migration/trace-events index f0e1cb80c7..d327d29ca0 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -58,7 +58,7 @@ postcopy_page_req_sync(void *host_addr) "sync page req %p" vmstate_load_field_error(const char *field, int ret) "field \"%s\" load failed, ret = %d" vmstate_load_state(const char *name, int version_id) "%s v%d" vmstate_load_state_end(const char *name, const char *reason, int val) "%s %s/%d" -vmstate_load_state_field(const char *name, const char *field) "%s:%s" +vmstate_load_state_field(const char *name, const char *field, bool exists) "%s:%s exists=%d" vmstate_n_elems(const char *name, int n_elems) "%s: %d" vmstate_subsection_load(const char *parent) "%s" vmstate_subsection_load_bad(const char *parent, const char *sub, const char *sub2) "%s: %s/%s" -- 2.41.0.28.gd7d8841f67 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH 1/4] migration: add "exists" info to load-state-field trace 2024-05-07 11:19 ` [PATCH 1/4] migration: add "exists" info to load-state-field trace marcandre.lureau @ 2024-05-07 15:29 ` Peter Xu 0 siblings, 0 replies; 16+ messages in thread From: Peter Xu @ 2024-05-07 15:29 UTC (permalink / raw) To: marcandre.lureau Cc: qemu-devel, Richard Henderson, Sebastian Ott, Fabiano Rosas, Eduardo Habkost, Fiona Ebner, Gerd Hoffmann, Philippe Mathieu-Daudé, Marcel Apfelbaum, peter.maydell, Michael S. Tsirkin, Yanan Wang On Tue, May 07, 2024 at 03:19:17PM +0400, marcandre.lureau@redhat.com wrote: > From: Marc-André Lureau <marcandre.lureau@redhat.com> > > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> -- Peter Xu ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 2/4] include/migration: add VMSTATE_VSTRUCT_TEST_VARRAY_UINT32 2024-05-07 11:19 [PATCH 0/4] Fix "virtio-gpu: fix scanout migration post-load" marcandre.lureau 2024-05-07 11:19 ` [PATCH 1/4] migration: add "exists" info to load-state-field trace marcandre.lureau @ 2024-05-07 11:19 ` marcandre.lureau 2024-05-07 11:19 ` [PATCH 3/4] virtio-gpu: use a VMState variant for the scanout field marcandre.lureau ` (3 subsequent siblings) 5 siblings, 0 replies; 16+ messages in thread From: marcandre.lureau @ 2024-05-07 11:19 UTC (permalink / raw) To: qemu-devel Cc: Richard Henderson, Sebastian Ott, Fabiano Rosas, Eduardo Habkost, Fiona Ebner, Gerd Hoffmann, Peter Xu, Philippe Mathieu-Daudé, Marcel Apfelbaum, peter.maydell, Michael S. Tsirkin, Yanan Wang, Marc-André Lureau From: Marc-André Lureau <marcandre.lureau@redhat.com> Add a macro to declare a VMState field which contains an array of struct with a specific version. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> --- include/migration/vmstate.h | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h index 294d2d8486..eaff3c8ff2 100644 --- a/include/migration/vmstate.h +++ b/include/migration/vmstate.h @@ -634,6 +634,18 @@ extern const VMStateInfo vmstate_info_qlist; .offset = vmstate_offset_varray(_state, _field, _type), \ } +#define VMSTATE_VSTRUCT_TEST_VARRAY_UINT32(_field, _state, _test, _field_num, _version, _vmsd, _type, _struct_version) { \ + .name = (stringify(_field)), \ + .version_id = (_version), \ + .struct_version_id = (_struct_version), \ + .field_exists = (_test), \ + .num_offset = vmstate_offset_value(_state, _field_num, uint32_t), \ + .vmsd = &(_vmsd), \ + .size = sizeof(_type), \ + .flags = VMS_VSTRUCT|VMS_VARRAY_UINT32, \ + .offset = vmstate_offset_varray(_state, _field, _type), \ +} + #define VMSTATE_STRUCT_VARRAY_ALLOC(_field, _state, _field_num, _version, _vmsd, _type) {\ .name = (stringify(_field)), \ .version_id = (_version), \ -- 2.41.0.28.gd7d8841f67 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 3/4] virtio-gpu: use a VMState variant for the scanout field 2024-05-07 11:19 [PATCH 0/4] Fix "virtio-gpu: fix scanout migration post-load" marcandre.lureau 2024-05-07 11:19 ` [PATCH 1/4] migration: add "exists" info to load-state-field trace marcandre.lureau 2024-05-07 11:19 ` [PATCH 2/4] include/migration: add VMSTATE_VSTRUCT_TEST_VARRAY_UINT32 marcandre.lureau @ 2024-05-07 11:19 ` marcandre.lureau 2024-05-07 19:59 ` Peter Xu 2024-05-10 10:25 ` Michael S. Tsirkin 2024-05-07 11:19 ` [PATCH 4/4] virtio-gpu: add x-vmstate-version marcandre.lureau ` (2 subsequent siblings) 5 siblings, 2 replies; 16+ messages in thread From: marcandre.lureau @ 2024-05-07 11:19 UTC (permalink / raw) To: qemu-devel Cc: Richard Henderson, Sebastian Ott, Fabiano Rosas, Eduardo Habkost, Fiona Ebner, Gerd Hoffmann, Peter Xu, Philippe Mathieu-Daudé, Marcel Apfelbaum, peter.maydell, Michael S. Tsirkin, Yanan Wang, Marc-André Lureau From: Marc-André Lureau <marcandre.lureau@redhat.com> Depending on the version, use v1 or v2 of the scanout VM state. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> --- hw/display/virtio-gpu.c | 22 +++++++++++++++++----- 1 file changed, 17 insertions(+), 5 deletions(-) diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c index ae831b6b3e..4fd72caf3f 100644 --- a/hw/display/virtio-gpu.c +++ b/hw/display/virtio-gpu.c @@ -1191,17 +1191,29 @@ static const VMStateDescription vmstate_virtio_gpu_scanout = { }, }; +static bool vmstate_before_v2(void *opaque, int version) +{ + return version <= 1; +} + static const VMStateDescription vmstate_virtio_gpu_scanouts = { .name = "virtio-gpu-scanouts", - .version_id = 1, + .version_id = 2, + .minimum_version_id = 1, .fields = (const VMStateField[]) { VMSTATE_INT32(parent_obj.enable, struct VirtIOGPU), VMSTATE_UINT32_EQUAL(parent_obj.conf.max_outputs, struct VirtIOGPU, NULL), - VMSTATE_STRUCT_VARRAY_UINT32(parent_obj.scanout, struct VirtIOGPU, - parent_obj.conf.max_outputs, 1, - vmstate_virtio_gpu_scanout, - struct virtio_gpu_scanout), + VMSTATE_VSTRUCT_TEST_VARRAY_UINT32(parent_obj.scanout, struct VirtIOGPU, + vmstate_before_v2, + parent_obj.conf.max_outputs, 1, + vmstate_virtio_gpu_scanout, + struct virtio_gpu_scanout, 1), + VMSTATE_VSTRUCT_TEST_VARRAY_UINT32(parent_obj.scanout, struct VirtIOGPU, + NULL, + parent_obj.conf.max_outputs, 2, + vmstate_virtio_gpu_scanout, + struct virtio_gpu_scanout, 2), VMSTATE_END_OF_LIST() }, }; -- 2.41.0.28.gd7d8841f67 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH 3/4] virtio-gpu: use a VMState variant for the scanout field 2024-05-07 11:19 ` [PATCH 3/4] virtio-gpu: use a VMState variant for the scanout field marcandre.lureau @ 2024-05-07 19:59 ` Peter Xu 2024-05-10 8:39 ` Marc-André Lureau 2024-05-10 10:25 ` Michael S. Tsirkin 1 sibling, 1 reply; 16+ messages in thread From: Peter Xu @ 2024-05-07 19:59 UTC (permalink / raw) To: marcandre.lureau Cc: qemu-devel, Richard Henderson, Sebastian Ott, Fabiano Rosas, Eduardo Habkost, Fiona Ebner, Gerd Hoffmann, Philippe Mathieu-Daudé, Marcel Apfelbaum, peter.maydell, Michael S. Tsirkin, Yanan Wang On Tue, May 07, 2024 at 03:19:19PM +0400, marcandre.lureau@redhat.com wrote: > From: Marc-André Lureau <marcandre.lureau@redhat.com> > > Depending on the version, use v1 or v2 of the scanout VM state. > > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> > --- > hw/display/virtio-gpu.c | 22 +++++++++++++++++----- > 1 file changed, 17 insertions(+), 5 deletions(-) > > diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c > index ae831b6b3e..4fd72caf3f 100644 > --- a/hw/display/virtio-gpu.c > +++ b/hw/display/virtio-gpu.c > @@ -1191,17 +1191,29 @@ static const VMStateDescription vmstate_virtio_gpu_scanout = { > }, > }; > > +static bool vmstate_before_v2(void *opaque, int version) > +{ > + return version <= 1; > +} > + > static const VMStateDescription vmstate_virtio_gpu_scanouts = { > .name = "virtio-gpu-scanouts", > - .version_id = 1, > + .version_id = 2, > + .minimum_version_id = 1, > .fields = (const VMStateField[]) { > VMSTATE_INT32(parent_obj.enable, struct VirtIOGPU), > VMSTATE_UINT32_EQUAL(parent_obj.conf.max_outputs, > struct VirtIOGPU, NULL), > - VMSTATE_STRUCT_VARRAY_UINT32(parent_obj.scanout, struct VirtIOGPU, > - parent_obj.conf.max_outputs, 1, > - vmstate_virtio_gpu_scanout, > - struct virtio_gpu_scanout), > + VMSTATE_VSTRUCT_TEST_VARRAY_UINT32(parent_obj.scanout, struct VirtIOGPU, > + vmstate_before_v2, > + parent_obj.conf.max_outputs, 1, > + vmstate_virtio_gpu_scanout, > + struct virtio_gpu_scanout, 1), > + VMSTATE_VSTRUCT_TEST_VARRAY_UINT32(parent_obj.scanout, struct VirtIOGPU, > + NULL, > + parent_obj.conf.max_outputs, 2, > + vmstate_virtio_gpu_scanout, > + struct virtio_gpu_scanout, 2), Personally I really wished struct_version_id never existed.. After these years we only have 1 user of it (hw/ipmi/isa_ipmi_kcs.c), and we need to keep that working. I'm wondering whether there's way we can avoid adding one more user, and even more complicated.. I think I get the reasoning behind "we define the same thing twice", but this is really tricky and definitely needs rich documents, meanwhiel all of these seem to rely on so many small details: one set .field_exists properly, one leaving it to NULL; also the two versionings used here for both parent vmsd, and the struct, and for each entry we need to set different versions to different fields.. Would it work if we only make the new fields under control with vmstate_before_v2()? IOW, making all below with .field_exists=vmstate_before_v2, so skip below when machine type is old? VMSTATE_UINT32_V(fb.format, struct virtio_gpu_scanout, 2), VMSTATE_UINT32_V(fb.bytes_pp, struct virtio_gpu_scanout, 2), VMSTATE_UINT32_V(fb.width, struct virtio_gpu_scanout, 2), VMSTATE_UINT32_V(fb.height, struct virtio_gpu_scanout, 2), VMSTATE_UINT32_V(fb.stride, struct virtio_gpu_scanout, 2), VMSTATE_UINT32_V(fb.offset, struct virtio_gpu_scanout, 2), Thanks, -- Peter Xu ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/4] virtio-gpu: use a VMState variant for the scanout field 2024-05-07 19:59 ` Peter Xu @ 2024-05-10 8:39 ` Marc-André Lureau 2024-05-10 17:33 ` Peter Xu 0 siblings, 1 reply; 16+ messages in thread From: Marc-André Lureau @ 2024-05-10 8:39 UTC (permalink / raw) To: Peter Xu Cc: qemu-devel, Richard Henderson, Sebastian Ott, Fabiano Rosas, Eduardo Habkost, Fiona Ebner, Gerd Hoffmann, Philippe Mathieu-Daudé, Marcel Apfelbaum, peter.maydell, Michael S. Tsirkin, Yanan Wang Hi On Wed, May 8, 2024 at 12:01 AM Peter Xu <peterx@redhat.com> wrote: > > On Tue, May 07, 2024 at 03:19:19PM +0400, marcandre.lureau@redhat.com wrote: > > From: Marc-André Lureau <marcandre.lureau@redhat.com> > > > > Depending on the version, use v1 or v2 of the scanout VM state. > > > > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> > > --- > > hw/display/virtio-gpu.c | 22 +++++++++++++++++----- > > 1 file changed, 17 insertions(+), 5 deletions(-) > > > > diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c > > index ae831b6b3e..4fd72caf3f 100644 > > --- a/hw/display/virtio-gpu.c > > +++ b/hw/display/virtio-gpu.c > > @@ -1191,17 +1191,29 @@ static const VMStateDescription vmstate_virtio_gpu_scanout = { > > }, > > }; > > > > +static bool vmstate_before_v2(void *opaque, int version) > > +{ > > + return version <= 1; > > +} > > + > > static const VMStateDescription vmstate_virtio_gpu_scanouts = { > > .name = "virtio-gpu-scanouts", > > - .version_id = 1, > > + .version_id = 2, > > + .minimum_version_id = 1, > > .fields = (const VMStateField[]) { > > VMSTATE_INT32(parent_obj.enable, struct VirtIOGPU), > > VMSTATE_UINT32_EQUAL(parent_obj.conf.max_outputs, > > struct VirtIOGPU, NULL), > > - VMSTATE_STRUCT_VARRAY_UINT32(parent_obj.scanout, struct VirtIOGPU, > > - parent_obj.conf.max_outputs, 1, > > - vmstate_virtio_gpu_scanout, > > - struct virtio_gpu_scanout), > > + VMSTATE_VSTRUCT_TEST_VARRAY_UINT32(parent_obj.scanout, struct VirtIOGPU, > > + vmstate_before_v2, > > + parent_obj.conf.max_outputs, 1, > > + vmstate_virtio_gpu_scanout, > > + struct virtio_gpu_scanout, 1), > > + VMSTATE_VSTRUCT_TEST_VARRAY_UINT32(parent_obj.scanout, struct VirtIOGPU, > > + NULL, > > + parent_obj.conf.max_outputs, 2, > > + vmstate_virtio_gpu_scanout, > > + struct virtio_gpu_scanout, 2), > > Personally I really wished struct_version_id never existed.. After these > years we only have 1 user of it (hw/ipmi/isa_ipmi_kcs.c), and we need to > keep that working. I'm wondering whether there's way we can avoid adding > one more user, and even more complicated.. > > I think I get the reasoning behind "we define the same thing twice", but > this is really tricky and definitely needs rich documents, meanwhiel all of > these seem to rely on so many small details: one set .field_exists > properly, one leaving it to NULL; also the two versionings used here for > both parent vmsd, and the struct, and for each entry we need to set > different versions to different fields.. > > Would it work if we only make the new fields under control with > vmstate_before_v2()? IOW, making all below with > .field_exists=vmstate_before_v2, so skip below when machine type is old? The existing VMSTATE_STRUCT_VARRAY_UINT32 would always use the latest vmstate_virtio_gpu_scanout version_id (2 in master). The "struct_version_id" solution allows setting the vmstate_virtio_gpu_scanout version_id to 1 if the parent struct (vmstate_virtio_gpu_scanouts) is 1, and 2 if the parent is 2. Since we don't have per VMSD version information on the wire, nested struct versioning is quite limited and cumbersome. I am not sure it can be changed without breaking the stream format, and whether it's worthwhile. > > VMSTATE_UINT32_V(fb.format, struct virtio_gpu_scanout, 2), > VMSTATE_UINT32_V(fb.bytes_pp, struct virtio_gpu_scanout, 2), > VMSTATE_UINT32_V(fb.width, struct virtio_gpu_scanout, 2), > VMSTATE_UINT32_V(fb.height, struct virtio_gpu_scanout, 2), > VMSTATE_UINT32_V(fb.stride, struct virtio_gpu_scanout, 2), > VMSTATE_UINT32_V(fb.offset, struct virtio_gpu_scanout, 2), > > Thanks, > > -- > Peter Xu > > -- Marc-André Lureau ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/4] virtio-gpu: use a VMState variant for the scanout field 2024-05-10 8:39 ` Marc-André Lureau @ 2024-05-10 17:33 ` Peter Xu 2024-05-11 6:44 ` Marc-André Lureau 0 siblings, 1 reply; 16+ messages in thread From: Peter Xu @ 2024-05-10 17:33 UTC (permalink / raw) To: Marc-André Lureau Cc: qemu-devel, Richard Henderson, Sebastian Ott, Fabiano Rosas, Eduardo Habkost, Fiona Ebner, Gerd Hoffmann, Philippe Mathieu-Daudé, Marcel Apfelbaum, peter.maydell, Michael S. Tsirkin, Yanan Wang Hi, Marc-André, On Fri, May 10, 2024 at 12:39:34PM +0400, Marc-André Lureau wrote: > Since we don't have per VMSD version information on the wire, nested > struct versioning is quite limited and cumbersome. I am not sure it > can be changed without breaking the stream format, and whether it's > worthwhile. Right that's a major pain, and actually I just notice it.. I think it'll be much, much simpler if we keep vmsd version on the wire for each VMSD (including struct fields), then it makes more sense to me. Then when I went back and see again the VSTRUCT thing... I can hardly understand what it is doing, and also how it works at all. Look at the current only IPMI user, who has: VMSTATE_VSTRUCT_V(kcs, ISAIPMIKCSDevice, 2, vmstate_IPMIKCS, IPMIKCS, 2), It is setting both vmsd version and struct_version to 2. I can't tell why it matters then if anyway both of the fields are the same.. When we do save(), there is: } else if (field->flags & VMS_STRUCT) { ret = vmstate_save_state(f, field->vmsd, curr_elem, vmdesc_loop); } else if (field->flags & VMS_VSTRUCT) { ret = vmstate_save_state_v(f, field->vmsd, curr_elem, vmdesc_loop, field->struct_version_id, errp); When we load(): } else if (field->flags & VMS_STRUCT) { ret = vmstate_load_state(f, field->vmsd, curr_elem, field->vmsd->version_id); } else if (field->flags & VMS_VSTRUCT) { ret = vmstate_load_state(f, field->vmsd, curr_elem, field->struct_version_id); } else { In this case, passing in struct_version==version should have zero effect afaict, because the default behavior is passing in vmsd->version_id anyway. Moreover, now I highly doubt whether the VMS_STRUCT whole thing makes sense at all as you mentioned. Especially on the load side, here we should rely on vmstate_load_state() taking the last parameter as version_id on the wire. Here we're passing in the struct's version_id or struct_version_id, and neither of them makes sense to me... if we miss that version_id information, afaiu we should simply fix it and put it on the wire.. It'll break migration, we may need to work that out, but I don't see a better way. Keeping it like this like a nightmare to me.. :-( Irrelevant of all these mess.. For this specific problem, what I meant is exactly what Michael was requesting too (hopefully), I'd want to avoid further extending the complexity in this area. I have a patch attached at last which I also tested 8.2<->9.0 bi-directional migrations and it worked for me when I smoked it. Please have a look to see whether that makes sense and at the meantime avoid most of the tricks. I'd also like to mention one more thing just in case this can cause some more attention to virtio guys.. Normally I ran vmstate-static-checker.py before softfreeze, and I did it for 9.0 too without seeing this problem. It isn't raised because all virtio devices are using the "self managed" VMSTATE_VIRTIO_DEVICE to migrate. In that case I am out of luck. We can further extend what Fabiano mentioned in the other thread to cover migration stream validations in the future, but just to mention IMHO that needs extra work, and may work most likely the same as vmstate static checker but just waste many more cpu resources. It'll be good if someone could still help move virtio towards like most of the rest devices, or at least get covered by the static checker, too. But that definitely is a separate topic too.. so we can address the immediate breakage first. Thanks, ==8<== From a24ef99670fa7102da461d795aed4a957bad86b1 Mon Sep 17 00:00:00 2001 From: Peter Xu <peterx@redhat.com> Date: Fri, 10 May 2024 12:33:34 -0400 Subject: [PATCH] fix gpu Signed-off-by: Peter Xu <peterx@redhat.com> --- include/hw/virtio/virtio-gpu.h | 2 +- hw/core/machine.c | 1 + hw/display/virtio-gpu.c | 21 +++++++++++++++------ 3 files changed, 17 insertions(+), 7 deletions(-) diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h index ed44cdad6b..e128501bdc 100644 --- a/include/hw/virtio/virtio-gpu.h +++ b/include/hw/virtio/virtio-gpu.h @@ -176,7 +176,7 @@ typedef struct VGPUDMABuf { struct VirtIOGPU { VirtIOGPUBase parent_obj; - + uint8_t vmstate_version; uint64_t conf_max_hostmem; VirtQueue *ctrl_vq; diff --git a/hw/core/machine.c b/hw/core/machine.c index 4ff60911e7..8f6f0dda7c 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -42,6 +42,7 @@ GlobalProperty hw_compat_8_2[] = { { "migration", "zero-page-detection", "legacy"}, { TYPE_VIRTIO_IOMMU_PCI, "granule", "4k" }, { TYPE_VIRTIO_IOMMU_PCI, "aw-bits", "64" }, + { "virtio-gpu-device", "x-vmstate-version", "1" }, }; const size_t hw_compat_8_2_len = G_N_ELEMENTS(hw_compat_8_2); diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c index ae831b6b3e..c53f55404c 100644 --- a/hw/display/virtio-gpu.c +++ b/hw/display/virtio-gpu.c @@ -1166,6 +1166,14 @@ static void virtio_gpu_cursor_bh(void *opaque) virtio_gpu_handle_cursor(&g->parent_obj.parent_obj, g->cursor_vq); } +static bool vmstate_after_v2(void *opaque, int version) +{ + struct VirtIOGPUBase *base = container_of(opaque, VirtIOGPUBase, scanout); + struct VirtIOGPU *gpu = container_of(base, VirtIOGPU, parent_obj); + + return gpu->vmstate_version >= 2; +} + static const VMStateDescription vmstate_virtio_gpu_scanout = { .name = "virtio-gpu-one-scanout", .version_id = 2, @@ -1181,12 +1189,12 @@ static const VMStateDescription vmstate_virtio_gpu_scanout = { VMSTATE_UINT32(cursor.hot_y, struct virtio_gpu_scanout), VMSTATE_UINT32(cursor.pos.x, struct virtio_gpu_scanout), VMSTATE_UINT32(cursor.pos.y, struct virtio_gpu_scanout), - VMSTATE_UINT32_V(fb.format, struct virtio_gpu_scanout, 2), - VMSTATE_UINT32_V(fb.bytes_pp, struct virtio_gpu_scanout, 2), - VMSTATE_UINT32_V(fb.width, struct virtio_gpu_scanout, 2), - VMSTATE_UINT32_V(fb.height, struct virtio_gpu_scanout, 2), - VMSTATE_UINT32_V(fb.stride, struct virtio_gpu_scanout, 2), - VMSTATE_UINT32_V(fb.offset, struct virtio_gpu_scanout, 2), + VMSTATE_UINT32_TEST(fb.format, struct virtio_gpu_scanout,vmstate_after_v2), + VMSTATE_UINT32_TEST(fb.bytes_pp, struct virtio_gpu_scanout,vmstate_after_v2), + VMSTATE_UINT32_TEST(fb.width, struct virtio_gpu_scanout,vmstate_after_v2), + VMSTATE_UINT32_TEST(fb.height, struct virtio_gpu_scanout,vmstate_after_v2), + VMSTATE_UINT32_TEST(fb.stride, struct virtio_gpu_scanout,vmstate_after_v2), + VMSTATE_UINT32_TEST(fb.offset, struct virtio_gpu_scanout,vmstate_after_v2), VMSTATE_END_OF_LIST() }, }; @@ -1659,6 +1667,7 @@ static Property virtio_gpu_properties[] = { DEFINE_PROP_BIT("blob", VirtIOGPU, parent_obj.conf.flags, VIRTIO_GPU_FLAG_BLOB_ENABLED, false), DEFINE_PROP_SIZE("hostmem", VirtIOGPU, parent_obj.conf.hostmem, 0), + DEFINE_PROP_UINT8("x-vmstate-version", VirtIOGPU, vmstate_version, 2), DEFINE_PROP_END_OF_LIST(), }; -- 2.44.0 -- Peter Xu ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH 3/4] virtio-gpu: use a VMState variant for the scanout field 2024-05-10 17:33 ` Peter Xu @ 2024-05-11 6:44 ` Marc-André Lureau 0 siblings, 0 replies; 16+ messages in thread From: Marc-André Lureau @ 2024-05-11 6:44 UTC (permalink / raw) To: Peter Xu Cc: qemu-devel, Richard Henderson, Sebastian Ott, Fabiano Rosas, Eduardo Habkost, Fiona Ebner, Gerd Hoffmann, Philippe Mathieu-Daudé, Marcel Apfelbaum, peter.maydell, Michael S. Tsirkin, Yanan Wang Hi Peter On Fri, May 10, 2024 at 9:33 PM Peter Xu <peterx@redhat.com> wrote: > > Hi, Marc-André, > > On Fri, May 10, 2024 at 12:39:34PM +0400, Marc-André Lureau wrote: > > Since we don't have per VMSD version information on the wire, nested > > struct versioning is quite limited and cumbersome. I am not sure it > > can be changed without breaking the stream format, and whether it's > > worthwhile. > > Right that's a major pain, and actually I just notice it.. > > I think it'll be much, much simpler if we keep vmsd version on the wire for > each VMSD (including struct fields), then it makes more sense to me. > > Then when I went back and see again the VSTRUCT thing... I can hardly > understand what it is doing, and also how it works at all. > > Look at the current only IPMI user, who has: > > VMSTATE_VSTRUCT_V(kcs, ISAIPMIKCSDevice, 2, vmstate_IPMIKCS, > IPMIKCS, 2), > > It is setting both vmsd version and struct_version to 2. I can't tell why > it matters then if anyway both of the fields are the same.. > > When we do save(), there is: > > } else if (field->flags & VMS_STRUCT) { > ret = vmstate_save_state(f, field->vmsd, curr_elem, > vmdesc_loop); > } else if (field->flags & VMS_VSTRUCT) { > ret = vmstate_save_state_v(f, field->vmsd, curr_elem, > vmdesc_loop, > field->struct_version_id, errp); > > When we load(): > > } else if (field->flags & VMS_STRUCT) { > ret = vmstate_load_state(f, field->vmsd, curr_elem, > field->vmsd->version_id); > } else if (field->flags & VMS_VSTRUCT) { > ret = vmstate_load_state(f, field->vmsd, curr_elem, > field->struct_version_id); > } else { > > In this case, passing in struct_version==version should have zero effect > afaict, because the default behavior is passing in vmsd->version_id anyway. IPMI KCS being a top-level section, the fields with an unsupported version are filtered before reaching this code. But since I can't see how a machine will have a specific version, it only helps for backward migration, which is quite limited. > > Moreover, now I highly doubt whether the VMS_STRUCT whole thing makes sense > at all as you mentioned. Especially on the load side, here we should rely > on vmstate_load_state() taking the last parameter as version_id on the > wire. Here we're passing in the struct's version_id or struct_version_id, > and neither of them makes sense to me... if we miss that version_id > information, afaiu we should simply fix it and put it on the wire.. It'll > break migration, we may need to work that out, but I don't see a better > way. Keeping it like this like a nightmare to me.. :-( Ack. Do you think we should add a version on the wire for each VMSD? that will likely be a format change. > > Irrelevant of all these mess.. For this specific problem, what I meant is > exactly what Michael was requesting too (hopefully), I'd want to avoid > further extending the complexity in this area. I have a patch attached at > last which I also tested 8.2<->9.0 bi-directional migrations and it worked > for me when I smoked it. Please have a look to see whether that makes > sense and at the meantime avoid most of the tricks. Works for me! thanks for figuring out how to get back the VirtioGPU* ! I'll send v2 with your patch. > > I'd also like to mention one more thing just in case this can cause some > more attention to virtio guys.. > > Normally I ran vmstate-static-checker.py before softfreeze, and I did it > for 9.0 too without seeing this problem. It isn't raised because all > virtio devices are using the "self managed" VMSTATE_VIRTIO_DEVICE to > migrate. In that case I am out of luck. We can further extend what > Fabiano mentioned in the other thread to cover migration stream validations > in the future, but just to mention IMHO that needs extra work, and may work > most likely the same as vmstate static checker but just waste many more cpu > resources. It'll be good if someone could still help move virtio towards > like most of the rest devices, or at least get covered by the static > checker, too. But that definitely is a separate topic too.. so we can > address the immediate breakage first. > > Thanks, > > ==8<== > From a24ef99670fa7102da461d795aed4a957bad86b1 Mon Sep 17 00:00:00 2001 > From: Peter Xu <peterx@redhat.com> > Date: Fri, 10 May 2024 12:33:34 -0400 > Subject: [PATCH] fix gpu > > Signed-off-by: Peter Xu <peterx@redhat.com> > --- > include/hw/virtio/virtio-gpu.h | 2 +- > hw/core/machine.c | 1 + > hw/display/virtio-gpu.c | 21 +++++++++++++++------ > 3 files changed, 17 insertions(+), 7 deletions(-) > > diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h > index ed44cdad6b..e128501bdc 100644 > --- a/include/hw/virtio/virtio-gpu.h > +++ b/include/hw/virtio/virtio-gpu.h > @@ -176,7 +176,7 @@ typedef struct VGPUDMABuf { > > struct VirtIOGPU { > VirtIOGPUBase parent_obj; > - > + uint8_t vmstate_version; > uint64_t conf_max_hostmem; > > VirtQueue *ctrl_vq; > diff --git a/hw/core/machine.c b/hw/core/machine.c > index 4ff60911e7..8f6f0dda7c 100644 > --- a/hw/core/machine.c > +++ b/hw/core/machine.c > @@ -42,6 +42,7 @@ GlobalProperty hw_compat_8_2[] = { > { "migration", "zero-page-detection", "legacy"}, > { TYPE_VIRTIO_IOMMU_PCI, "granule", "4k" }, > { TYPE_VIRTIO_IOMMU_PCI, "aw-bits", "64" }, > + { "virtio-gpu-device", "x-vmstate-version", "1" }, > }; > const size_t hw_compat_8_2_len = G_N_ELEMENTS(hw_compat_8_2); > > diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c > index ae831b6b3e..c53f55404c 100644 > --- a/hw/display/virtio-gpu.c > +++ b/hw/display/virtio-gpu.c > @@ -1166,6 +1166,14 @@ static void virtio_gpu_cursor_bh(void *opaque) > virtio_gpu_handle_cursor(&g->parent_obj.parent_obj, g->cursor_vq); > } > > +static bool vmstate_after_v2(void *opaque, int version) > +{ > + struct VirtIOGPUBase *base = container_of(opaque, VirtIOGPUBase, scanout); > + struct VirtIOGPU *gpu = container_of(base, VirtIOGPU, parent_obj); > + > + return gpu->vmstate_version >= 2; > +} > + > static const VMStateDescription vmstate_virtio_gpu_scanout = { > .name = "virtio-gpu-one-scanout", > .version_id = 2, > @@ -1181,12 +1189,12 @@ static const VMStateDescription vmstate_virtio_gpu_scanout = { > VMSTATE_UINT32(cursor.hot_y, struct virtio_gpu_scanout), > VMSTATE_UINT32(cursor.pos.x, struct virtio_gpu_scanout), > VMSTATE_UINT32(cursor.pos.y, struct virtio_gpu_scanout), > - VMSTATE_UINT32_V(fb.format, struct virtio_gpu_scanout, 2), > - VMSTATE_UINT32_V(fb.bytes_pp, struct virtio_gpu_scanout, 2), > - VMSTATE_UINT32_V(fb.width, struct virtio_gpu_scanout, 2), > - VMSTATE_UINT32_V(fb.height, struct virtio_gpu_scanout, 2), > - VMSTATE_UINT32_V(fb.stride, struct virtio_gpu_scanout, 2), > - VMSTATE_UINT32_V(fb.offset, struct virtio_gpu_scanout, 2), > + VMSTATE_UINT32_TEST(fb.format, struct virtio_gpu_scanout,vmstate_after_v2), > + VMSTATE_UINT32_TEST(fb.bytes_pp, struct virtio_gpu_scanout,vmstate_after_v2), > + VMSTATE_UINT32_TEST(fb.width, struct virtio_gpu_scanout,vmstate_after_v2), > + VMSTATE_UINT32_TEST(fb.height, struct virtio_gpu_scanout,vmstate_after_v2), > + VMSTATE_UINT32_TEST(fb.stride, struct virtio_gpu_scanout,vmstate_after_v2), > + VMSTATE_UINT32_TEST(fb.offset, struct virtio_gpu_scanout,vmstate_after_v2), > VMSTATE_END_OF_LIST() > }, > }; > @@ -1659,6 +1667,7 @@ static Property virtio_gpu_properties[] = { > DEFINE_PROP_BIT("blob", VirtIOGPU, parent_obj.conf.flags, > VIRTIO_GPU_FLAG_BLOB_ENABLED, false), > DEFINE_PROP_SIZE("hostmem", VirtIOGPU, parent_obj.conf.hostmem, 0), > + DEFINE_PROP_UINT8("x-vmstate-version", VirtIOGPU, vmstate_version, 2), > DEFINE_PROP_END_OF_LIST(), > }; > > -- > 2.44.0 > > > -- > Peter Xu > -- Marc-André Lureau ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/4] virtio-gpu: use a VMState variant for the scanout field 2024-05-07 11:19 ` [PATCH 3/4] virtio-gpu: use a VMState variant for the scanout field marcandre.lureau 2024-05-07 19:59 ` Peter Xu @ 2024-05-10 10:25 ` Michael S. Tsirkin 2024-05-10 11:57 ` Marc-André Lureau 1 sibling, 1 reply; 16+ messages in thread From: Michael S. Tsirkin @ 2024-05-10 10:25 UTC (permalink / raw) To: marcandre.lureau Cc: qemu-devel, Richard Henderson, Sebastian Ott, Fabiano Rosas, Eduardo Habkost, Fiona Ebner, Gerd Hoffmann, Peter Xu, Philippe Mathieu-Daudé, Marcel Apfelbaum, peter.maydell, Yanan Wang On Tue, May 07, 2024 at 03:19:19PM +0400, marcandre.lureau@redhat.com wrote: > From: Marc-André Lureau <marcandre.lureau@redhat.com> > > Depending on the version, use v1 or v2 of the scanout VM state. > > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> > --- > hw/display/virtio-gpu.c | 22 +++++++++++++++++----- > 1 file changed, 17 insertions(+), 5 deletions(-) > > diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c > index ae831b6b3e..4fd72caf3f 100644 > --- a/hw/display/virtio-gpu.c > +++ b/hw/display/virtio-gpu.c > @@ -1191,17 +1191,29 @@ static const VMStateDescription vmstate_virtio_gpu_scanout = { > }, > }; > > +static bool vmstate_before_v2(void *opaque, int version) > +{ > + return version <= 1; > +} > + > static const VMStateDescription vmstate_virtio_gpu_scanouts = { > .name = "virtio-gpu-scanouts", > - .version_id = 1, > + .version_id = 2, > + .minimum_version_id = 1, > .fields = (const VMStateField[]) { > VMSTATE_INT32(parent_obj.enable, struct VirtIOGPU), > VMSTATE_UINT32_EQUAL(parent_obj.conf.max_outputs, > struct VirtIOGPU, NULL), > - VMSTATE_STRUCT_VARRAY_UINT32(parent_obj.scanout, struct VirtIOGPU, > - parent_obj.conf.max_outputs, 1, > - vmstate_virtio_gpu_scanout, > - struct virtio_gpu_scanout), > + VMSTATE_VSTRUCT_TEST_VARRAY_UINT32(parent_obj.scanout, struct VirtIOGPU, > + vmstate_before_v2, > + parent_obj.conf.max_outputs, 1, > + vmstate_virtio_gpu_scanout, > + struct virtio_gpu_scanout, 1), > + VMSTATE_VSTRUCT_TEST_VARRAY_UINT32(parent_obj.scanout, struct VirtIOGPU, > + NULL, > + parent_obj.conf.max_outputs, 2, > + vmstate_virtio_gpu_scanout, > + struct virtio_gpu_scanout, 2), > VMSTATE_END_OF_LIST() > }, Just don't, please. Add a property and add a conditional field based on property, set from the compat machinery. > }; > -- > 2.41.0.28.gd7d8841f67 ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/4] virtio-gpu: use a VMState variant for the scanout field 2024-05-10 10:25 ` Michael S. Tsirkin @ 2024-05-10 11:57 ` Marc-André Lureau 0 siblings, 0 replies; 16+ messages in thread From: Marc-André Lureau @ 2024-05-10 11:57 UTC (permalink / raw) To: Michael S. Tsirkin Cc: qemu-devel, Richard Henderson, Sebastian Ott, Fabiano Rosas, Eduardo Habkost, Fiona Ebner, Gerd Hoffmann, Peter Xu, Philippe Mathieu-Daudé, Marcel Apfelbaum, peter.maydell, Yanan Wang Hi Michael On Fri, May 10, 2024 at 2:26 PM Michael S. Tsirkin <mst@redhat.com> wrote: > > On Tue, May 07, 2024 at 03:19:19PM +0400, marcandre.lureau@redhat.com wrote: > > From: Marc-André Lureau <marcandre.lureau@redhat.com> > > > > Depending on the version, use v1 or v2 of the scanout VM state. > > > > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> > > --- > > hw/display/virtio-gpu.c | 22 +++++++++++++++++----- > > 1 file changed, 17 insertions(+), 5 deletions(-) > > > > diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c > > index ae831b6b3e..4fd72caf3f 100644 > > --- a/hw/display/virtio-gpu.c > > +++ b/hw/display/virtio-gpu.c > > @@ -1191,17 +1191,29 @@ static const VMStateDescription vmstate_virtio_gpu_scanout = { > > }, > > }; > > > > +static bool vmstate_before_v2(void *opaque, int version) > > +{ > > + return version <= 1; > > +} > > + > > static const VMStateDescription vmstate_virtio_gpu_scanouts = { > > .name = "virtio-gpu-scanouts", > > - .version_id = 1, > > + .version_id = 2, > > + .minimum_version_id = 1, > > .fields = (const VMStateField[]) { > > VMSTATE_INT32(parent_obj.enable, struct VirtIOGPU), > > VMSTATE_UINT32_EQUAL(parent_obj.conf.max_outputs, > > struct VirtIOGPU, NULL), > > - VMSTATE_STRUCT_VARRAY_UINT32(parent_obj.scanout, struct VirtIOGPU, > > - parent_obj.conf.max_outputs, 1, > > - vmstate_virtio_gpu_scanout, > > - struct virtio_gpu_scanout), > > + VMSTATE_VSTRUCT_TEST_VARRAY_UINT32(parent_obj.scanout, struct VirtIOGPU, > > + vmstate_before_v2, > > + parent_obj.conf.max_outputs, 1, > > + vmstate_virtio_gpu_scanout, > > + struct virtio_gpu_scanout, 1), > > + VMSTATE_VSTRUCT_TEST_VARRAY_UINT32(parent_obj.scanout, struct VirtIOGPU, > > + NULL, > > + parent_obj.conf.max_outputs, 2, > > + vmstate_virtio_gpu_scanout, > > + struct virtio_gpu_scanout, 2), > > VMSTATE_END_OF_LIST() > > }, > > > Just don't, please. > Add a property and add a conditional field based on property, set > from the compat machinery. > The version isn't propagated through the nested VMSDs, so vmstate_virtio_gpu_scanout would only have v1, whether we have a field test or not. Can you be more explicit what alternative solution you propose? thanks -- Marc-André Lureau ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 4/4] virtio-gpu: add x-vmstate-version 2024-05-07 11:19 [PATCH 0/4] Fix "virtio-gpu: fix scanout migration post-load" marcandre.lureau ` (2 preceding siblings ...) 2024-05-07 11:19 ` [PATCH 3/4] virtio-gpu: use a VMState variant for the scanout field marcandre.lureau @ 2024-05-07 11:19 ` marcandre.lureau 2024-05-07 20:51 ` Fabiano Rosas 2024-05-07 20:46 ` [PATCH 0/4] Fix "virtio-gpu: fix scanout migration post-load" Fabiano Rosas 2024-05-08 9:51 ` Fiona Ebner 5 siblings, 1 reply; 16+ messages in thread From: marcandre.lureau @ 2024-05-07 11:19 UTC (permalink / raw) To: qemu-devel Cc: Richard Henderson, Sebastian Ott, Fabiano Rosas, Eduardo Habkost, Fiona Ebner, Gerd Hoffmann, Peter Xu, Philippe Mathieu-Daudé, Marcel Apfelbaum, peter.maydell, Michael S. Tsirkin, Yanan Wang, Marc-André Lureau From: Marc-André Lureau <marcandre.lureau@redhat.com> Machine <= 8.2 use v1. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> --- include/hw/virtio/virtio-gpu.h | 1 + hw/core/machine.c | 1 + hw/display/virtio-gpu.c | 6 ++++-- 3 files changed, 6 insertions(+), 2 deletions(-) diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h index ed44cdad6b..af1c77eb3f 100644 --- a/include/hw/virtio/virtio-gpu.h +++ b/include/hw/virtio/virtio-gpu.h @@ -177,6 +177,7 @@ typedef struct VGPUDMABuf { struct VirtIOGPU { VirtIOGPUBase parent_obj; + uint8_t vmstate_version; uint64_t conf_max_hostmem; VirtQueue *ctrl_vq; diff --git a/hw/core/machine.c b/hw/core/machine.c index 4ff60911e7..8f6f0dda7c 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -42,6 +42,7 @@ GlobalProperty hw_compat_8_2[] = { { "migration", "zero-page-detection", "legacy"}, { TYPE_VIRTIO_IOMMU_PCI, "granule", "4k" }, { TYPE_VIRTIO_IOMMU_PCI, "aw-bits", "64" }, + { "virtio-gpu-device", "x-vmstate-version", "1" }, }; const size_t hw_compat_8_2_len = G_N_ELEMENTS(hw_compat_8_2); diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c index 4fd72caf3f..ee1cc1acf9 100644 --- a/hw/display/virtio-gpu.c +++ b/hw/display/virtio-gpu.c @@ -1246,7 +1246,8 @@ static int virtio_gpu_save(QEMUFile *f, void *opaque, size_t size, } qemu_put_be32(f, 0); /* end of list */ - return vmstate_save_state(f, &vmstate_virtio_gpu_scanouts, g, NULL); + return vmstate_save_state_v(f, &vmstate_virtio_gpu_scanouts, g, + NULL, g->vmstate_version, NULL); } static bool virtio_gpu_load_restore_mapping(VirtIOGPU *g, @@ -1351,7 +1352,7 @@ static int virtio_gpu_load(QEMUFile *f, void *opaque, size_t size, } /* load & apply scanout state */ - vmstate_load_state(f, &vmstate_virtio_gpu_scanouts, g, 1); + vmstate_load_state(f, &vmstate_virtio_gpu_scanouts, g, g->vmstate_version); return 0; } @@ -1671,6 +1672,7 @@ static Property virtio_gpu_properties[] = { DEFINE_PROP_BIT("blob", VirtIOGPU, parent_obj.conf.flags, VIRTIO_GPU_FLAG_BLOB_ENABLED, false), DEFINE_PROP_SIZE("hostmem", VirtIOGPU, parent_obj.conf.hostmem, 0), + DEFINE_PROP_UINT8("x-vmstate-version", VirtIOGPU, vmstate_version, 2), DEFINE_PROP_END_OF_LIST(), }; -- 2.41.0.28.gd7d8841f67 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH 4/4] virtio-gpu: add x-vmstate-version 2024-05-07 11:19 ` [PATCH 4/4] virtio-gpu: add x-vmstate-version marcandre.lureau @ 2024-05-07 20:51 ` Fabiano Rosas 0 siblings, 0 replies; 16+ messages in thread From: Fabiano Rosas @ 2024-05-07 20:51 UTC (permalink / raw) To: marcandre.lureau, qemu-devel Cc: Richard Henderson, Sebastian Ott, Eduardo Habkost, Fiona Ebner, Gerd Hoffmann, Peter Xu, Philippe Mathieu-Daudé, Marcel Apfelbaum, peter.maydell, Michael S. Tsirkin, Yanan Wang, Marc-André Lureau marcandre.lureau@redhat.com writes: > From: Marc-André Lureau <marcandre.lureau@redhat.com> > > Machine <= 8.2 use v1. > > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 0/4] Fix "virtio-gpu: fix scanout migration post-load" 2024-05-07 11:19 [PATCH 0/4] Fix "virtio-gpu: fix scanout migration post-load" marcandre.lureau ` (3 preceding siblings ...) 2024-05-07 11:19 ` [PATCH 4/4] virtio-gpu: add x-vmstate-version marcandre.lureau @ 2024-05-07 20:46 ` Fabiano Rosas 2024-05-07 21:24 ` Peter Xu 2024-05-08 9:51 ` Fiona Ebner 5 siblings, 1 reply; 16+ messages in thread From: Fabiano Rosas @ 2024-05-07 20:46 UTC (permalink / raw) To: marcandre.lureau, qemu-devel Cc: Richard Henderson, Sebastian Ott, Eduardo Habkost, Fiona Ebner, Gerd Hoffmann, Peter Xu, Philippe Mathieu-Daudé, Marcel Apfelbaum, peter.maydell, Michael S. Tsirkin, Yanan Wang, Marc-André Lureau marcandre.lureau@redhat.com writes: > From: Marc-André Lureau <marcandre.lureau@redhat.com> > > Hi, > > The aforementioned patch breaks virtio-gpu device migrations for versions > pre-9.0/9.0, both forwards and backwards. Versioning of `VMS_STRUCT` is more > complex than it may initially appear, as evidenced in the problematic commit > dfcf74fa68c ("virtio-gpu: fix scanout migration post-load"). > > To resolve this, we need to propagate the `vmstate` `version_id` through the > nested structures. Additionally, we should tie specific machine version to a > corresponding `version_id` to maintain migration compatibility. > > `VMS_VSTRUCT` allows specifying the appropriate version of the nested structure > to use. This would have been caught by the migration-compat-x86_64 CI job had we added the virtio-gpu device to it. $ cd build-8.2 $ QTEST_TRACE='vmstate_*' QTEST_DEVICE_OPTS='-device virtio-gpu' \ QTEST_QEMU_BINARY=./qemu-system-x86_64 \ QTEST_QEMU_BINARY_DST=../build-9.0/qemu-system-x86_64 ./tests/qtest/migration-test ... vmstate_n_elems fb.offset: 1 vmstate_subsection_load virtio-gpu-one-scanout vmstate_subsection_load_good virtio-gpu-one-scanout vmstate_load_state_end virtio-gpu-one-scanout end/0 vmstate_subsection_load virtio-gpu-scanouts vmstate_subsection_load_good virtio-gpu-scanouts vmstate_load_state_end virtio-gpu-scanouts end/0 vmstate_subsection_load virtio-gpu vmstate_subsection_load_good virtio-gpu vmstate_load_state_end virtio-gpu end/0 vmstate_downtime_load type=non-iterable idstr=0000:00:03.0/virtio-gpu instance_id=0 downtime=32118 qemu-system-x86_64: Missing section footer for 0000:00:03.0/virtio-gpu vmstate_downtime_checkpoint dst-precopy-loadvm-completed qemu-system-x86_64: load of migration failed: Invalid argument Some considerations: 1) Here QTEST_DEVICE_OPTS is a hack I added on top, it doesn't currently exist. 2) This only uncovers relatively simple bugs where we don't need the guest to access the device, it just needs to be there. We could take the steps to enable this kind of testing if we think it's worthwhile. Some downsides are: a) the item (2) above - situations that depend on guest behavior are out of the picture because migration-test runs only a custom program that dirties memory; b) this test only works in CI or in a pre setup environment because it needs the previous QEMU version to be built beforehand; c) the full set of migration tests already runs a few times in CI via make check, plus the compat job. We'll probably need to do some simplification to avoid taking too much additional time; d) there's also the obvious maintenance burden of choosing devices and doing the eventual upkeep of the QEMU command line for the migration-test. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 0/4] Fix "virtio-gpu: fix scanout migration post-load" 2024-05-07 20:46 ` [PATCH 0/4] Fix "virtio-gpu: fix scanout migration post-load" Fabiano Rosas @ 2024-05-07 21:24 ` Peter Xu 0 siblings, 0 replies; 16+ messages in thread From: Peter Xu @ 2024-05-07 21:24 UTC (permalink / raw) To: Fabiano Rosas Cc: marcandre.lureau, qemu-devel, Richard Henderson, Sebastian Ott, Eduardo Habkost, Fiona Ebner, Gerd Hoffmann, Philippe Mathieu-Daudé, Marcel Apfelbaum, peter.maydell, Michael S. Tsirkin, Yanan Wang On Tue, May 07, 2024 at 05:46:36PM -0300, Fabiano Rosas wrote: > marcandre.lureau@redhat.com writes: > > > From: Marc-André Lureau <marcandre.lureau@redhat.com> > > > > Hi, > > > > The aforementioned patch breaks virtio-gpu device migrations for versions > > pre-9.0/9.0, both forwards and backwards. Versioning of `VMS_STRUCT` is more > > complex than it may initially appear, as evidenced in the problematic commit > > dfcf74fa68c ("virtio-gpu: fix scanout migration post-load"). > > > > To resolve this, we need to propagate the `vmstate` `version_id` through the > > nested structures. Additionally, we should tie specific machine version to a > > corresponding `version_id` to maintain migration compatibility. > > > > `VMS_VSTRUCT` allows specifying the appropriate version of the nested structure > > to use. > > This would have been caught by the migration-compat-x86_64 CI job had we > added the virtio-gpu device to it. I had exactly the same thoughts, and actually I added a todo after I noticed this issue but I forgot to discuss with you today.. actually I have one more on whether we can allow vmsd versioning to work for bi-direction (then we can get rid of machine type dependencies), we may need the handshake work as pre-requisite, so that two qemus need to talk first on verify the vmsds. But let's leave that for later. > > $ cd build-8.2 > $ QTEST_TRACE='vmstate_*' QTEST_DEVICE_OPTS='-device virtio-gpu' \ > QTEST_QEMU_BINARY=./qemu-system-x86_64 \ > QTEST_QEMU_BINARY_DST=../build-9.0/qemu-system-x86_64 ./tests/qtest/migration-test > ... > vmstate_n_elems fb.offset: 1 > vmstate_subsection_load virtio-gpu-one-scanout > vmstate_subsection_load_good virtio-gpu-one-scanout > vmstate_load_state_end virtio-gpu-one-scanout end/0 > vmstate_subsection_load virtio-gpu-scanouts > vmstate_subsection_load_good virtio-gpu-scanouts > vmstate_load_state_end virtio-gpu-scanouts end/0 > vmstate_subsection_load virtio-gpu > vmstate_subsection_load_good virtio-gpu > vmstate_load_state_end virtio-gpu end/0 > vmstate_downtime_load type=non-iterable idstr=0000:00:03.0/virtio-gpu instance_id=0 downtime=32118 > qemu-system-x86_64: Missing section footer for 0000:00:03.0/virtio-gpu > vmstate_downtime_checkpoint dst-precopy-loadvm-completed > qemu-system-x86_64: load of migration failed: Invalid argument > > Some considerations: > > 1) Here QTEST_DEVICE_OPTS is a hack I added on top, it doesn't currently > exist. > > 2) This only uncovers relatively simple bugs where we don't need the > guest to access the device, it just needs to be there. Right, but having something to cover the basics would still be nice, because the rarer the bug the less impact to users too (I hope!), so they can be with lower priority too when tested. > > We could take the steps to enable this kind of testing if we think it's > worthwhile. Some downsides are: > > a) the item (2) above - situations that depend on guest behavior are out > of the picture because migration-test runs only a custom program that > dirties memory; > > b) this test only works in CI or in a pre setup environment because it > needs the previous QEMU version to be built beforehand; > > c) the full set of migration tests already runs a few times in CI via > make check, plus the compat job. We'll probably need to do some > simplification to avoid taking too much additional time; > > d) there's also the obvious maintenance burden of choosing devices and > doing the eventual upkeep of the QEMU command line for the > migration-test. The last one should be mostly fine, iiuc - we shouldn't add any device that can easily break ABI, and the list of devices can start with a minumum; an extreme case is we add one device only if something broke first with a stable ABI. Then if the ABI is stable, we need to go through the deprecation procedure, and that takes two releases. We can drop the device in migration-test in the release of when deprecation starts. Thanks, -- Peter Xu ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 0/4] Fix "virtio-gpu: fix scanout migration post-load" 2024-05-07 11:19 [PATCH 0/4] Fix "virtio-gpu: fix scanout migration post-load" marcandre.lureau ` (4 preceding siblings ...) 2024-05-07 20:46 ` [PATCH 0/4] Fix "virtio-gpu: fix scanout migration post-load" Fabiano Rosas @ 2024-05-08 9:51 ` Fiona Ebner 5 siblings, 0 replies; 16+ messages in thread From: Fiona Ebner @ 2024-05-08 9:51 UTC (permalink / raw) To: marcandre.lureau, qemu-devel Cc: Richard Henderson, Sebastian Ott, Fabiano Rosas, Eduardo Habkost, Gerd Hoffmann, Peter Xu, Philippe Mathieu-Daudé, Marcel Apfelbaum, peter.maydell, Michael S. Tsirkin, Yanan Wang Am 07.05.24 um 13:19 schrieb marcandre.lureau@redhat.com: > From: Marc-André Lureau <marcandre.lureau@redhat.com> > > Hi, > > The aforementioned patch breaks virtio-gpu device migrations for versions > pre-9.0/9.0, both forwards and backwards. Versioning of `VMS_STRUCT` is more > complex than it may initially appear, as evidenced in the problematic commit > dfcf74fa68c ("virtio-gpu: fix scanout migration post-load"). > > To resolve this, we need to propagate the `vmstate` `version_id` through the > nested structures. Additionally, we should tie specific machine version to a > corresponding `version_id` to maintain migration compatibility. > > `VMS_VSTRUCT` allows specifying the appropriate version of the nested structure > to use. > > Marc-André Lureau (4): > migration: add "exists" info to load-state-field trace > include/migration: add VMSTATE_VSTRUCT_TEST_VARRAY_UINT32 > virtio-gpu: use a VMState variant for the scanout field > virtio-gpu: add x-vmstate-version > > include/hw/virtio/virtio-gpu.h | 1 + > include/migration/vmstate.h | 12 ++++++++++++ > hw/core/machine.c | 1 + > hw/display/virtio-gpu.c | 28 +++++++++++++++++++++------- > migration/vmstate.c | 5 +++-- > migration/trace-events | 2 +- > 6 files changed, 39 insertions(+), 10 deletions(-) > Tested-by: Fiona Ebner <f.ebner@proxmox.com> Thank you for the fix! Tested with an Ubuntu 23.10 VM: Machine type pc-i440fx-8.2: 1. create snapshot with 8.2, load with patched 9.0 2. create snapshot with patched 9.0, load with patched 9.0 and with 8.2 Machine type pc-i440fx-9.0: 1. create snapshot with patched 9.0, load with patched 9.0 No crashes/failures and didn't notice any other issues :) Best Regards, Fiona ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2024-05-11 6:45 UTC | newest] Thread overview: 16+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-05-07 11:19 [PATCH 0/4] Fix "virtio-gpu: fix scanout migration post-load" marcandre.lureau 2024-05-07 11:19 ` [PATCH 1/4] migration: add "exists" info to load-state-field trace marcandre.lureau 2024-05-07 15:29 ` Peter Xu 2024-05-07 11:19 ` [PATCH 2/4] include/migration: add VMSTATE_VSTRUCT_TEST_VARRAY_UINT32 marcandre.lureau 2024-05-07 11:19 ` [PATCH 3/4] virtio-gpu: use a VMState variant for the scanout field marcandre.lureau 2024-05-07 19:59 ` Peter Xu 2024-05-10 8:39 ` Marc-André Lureau 2024-05-10 17:33 ` Peter Xu 2024-05-11 6:44 ` Marc-André Lureau 2024-05-10 10:25 ` Michael S. Tsirkin 2024-05-10 11:57 ` Marc-André Lureau 2024-05-07 11:19 ` [PATCH 4/4] virtio-gpu: add x-vmstate-version marcandre.lureau 2024-05-07 20:51 ` Fabiano Rosas 2024-05-07 20:46 ` [PATCH 0/4] Fix "virtio-gpu: fix scanout migration post-load" Fabiano Rosas 2024-05-07 21:24 ` Peter Xu 2024-05-08 9:51 ` Fiona Ebner
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).