All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/7] migration: Fix s390 regressions + migration script
@ 2025-01-09 18:52 Fabiano Rosas
  2025-01-09 18:52 ` [PATCH v3 1/7] migration: Add more error handling to analyze-migration.py Fabiano Rosas
                   ` (6 more replies)
  0 siblings, 7 replies; 22+ messages in thread
From: Fabiano Rosas @ 2025-01-09 18:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Thomas Huth

changes:

- fixed appending again
- changed the VMSDFieldNull class to inherit from VMSDFieldGeneric

v2:
https://lore.kernel.org/r/20250109140959.19464-1-farosas@suse.de

- dropped comments patch
- new patch 4: rename the field to nullptr
- patch 6: add a sample JSON, fix the appending code

v1:
https://lore.kernel.org/r/20250107195025.9951-1-farosas@suse.de

Hi,

The situation that broke the last migration PR was:

1) emitting of JSON data by QEMU for
   VMSTATE_ARRAY_OF_POINTER_TO_STRUCT when NULL pointers are present
   has been broken for a while;

2) parsing of s390x migration stream by analyze-script.py has been
   broken for a while;

   (there's indications that it worked on s390x hosts, I'm assuming due
   to byte order coincidences)

3) s390x CSS migration has been broken for a while;

The s390x CSS migration uses VMSTATE_ARRAY_OF_POINTER_TO_STRUCT with
NULL pointers, triggering #1, but hidden due to #2 on TCG hosts and
due to #3 overall.

- patches 1: just to make rebase easier
- patches 2-3: cleanups
- patch 4: fixes #2
- patches 5-6: fix #1
- patch 7: fixes #3

Fabiano Rosas (6):
  migration: Add more error handling to analyze-migration.py
  migration: Remove unused argument in vmsd_desc_field_end
  migration: Fix parsing of s390 stream
  migration: Rename vmstate_info_nullptr
  migration: Fix arrays of pointers in JSON writer
  s390x: Fix CSS migration

Peter Xu (1):
  migration: Dump correct JSON format for nullptr replacement

 hw/s390x/s390-virtio-ccw.c   |   2 +-
 migration/vmstate-types.c    |   2 +-
 migration/vmstate.c          | 151 ++++++++++++++++++++++++++++-------
 scripts/analyze-migration.py | 142 +++++++++++++++++++++++---------
 4 files changed, 228 insertions(+), 69 deletions(-)

-- 
2.35.3



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v3 1/7] migration: Add more error handling to analyze-migration.py
  2025-01-09 18:52 [PATCH v3 0/7] migration: Fix s390 regressions + migration script Fabiano Rosas
@ 2025-01-09 18:52 ` Fabiano Rosas
  2025-01-09 18:52 ` [PATCH v3 2/7] migration: Remove unused argument in vmsd_desc_field_end Fabiano Rosas
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 22+ messages in thread
From: Fabiano Rosas @ 2025-01-09 18:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Thomas Huth

The analyze-migration script was seen failing in s390x in misterious
ways. It seems we're reaching the VMSDFieldStruct constructor without
any fields, which would indicate an empty .subsection entry, a
VMSTATE_STRUCT with no fields or a vmsd with no fields. We don't have
any of those, at least not without the unmigratable flag set, so this
should never happen.

Add some debug statements so that we can see what's going on the next
time the issue happens.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20250103141305.8435-1-farosas@suse.de>
---
 scripts/analyze-migration.py | 75 +++++++++++++++++++++---------------
 1 file changed, 45 insertions(+), 30 deletions(-)

diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py
index 8a254a5b6a..f2457b1dde 100755
--- a/scripts/analyze-migration.py
+++ b/scripts/analyze-migration.py
@@ -429,6 +429,9 @@ def __init__(self, desc, file):
         super(VMSDFieldStruct, self).__init__(desc, file)
         self.data = collections.OrderedDict()
 
+        if 'fields' not in self.desc['struct']:
+            raise Exception("No fields in struct. VMSD:\n%s" % self.desc)
+
         # When we see compressed array elements, unfold them here
         new_fields = []
         for field in self.desc['struct']['fields']:
@@ -477,6 +480,10 @@ def read(self):
                     raise Exception("Subsection %s not found at offset %x" % ( subsection['vmsd_name'], self.file.tell()))
                 name = self.file.readstr()
                 version_id = self.file.read32()
+
+                if not subsection:
+                    raise Exception("Empty description for subsection: %s" % name)
+
                 self.data[name] = VMSDSection(self.file, version_id, subsection, (name, 0))
                 self.data[name].read()
 
@@ -574,10 +581,13 @@ def __init__(self, filename):
         }
         self.filename = filename
         self.vmsd_desc = None
+        self.vmsd_json = ""
 
-    def read(self, desc_only = False, dump_memory = False, write_memory = False):
+    def read(self, desc_only = False, dump_memory = False,
+             write_memory = False):
         # Read in the whole file
         file = MigrationFile(self.filename)
+        self.vmsd_json = file.read_migration_debug_json()
 
         # File magic
         data = file.read32()
@@ -635,9 +645,11 @@ def read(self, desc_only = False, dump_memory = False, write_memory = False):
         file.close()
 
     def load_vmsd_json(self, file):
-        vmsd_json = file.read_migration_debug_json()
-        self.vmsd_desc = json.loads(vmsd_json, object_pairs_hook=collections.OrderedDict)
+        self.vmsd_desc = json.loads(self.vmsd_json,
+                                    object_pairs_hook=collections.OrderedDict)
         for device in self.vmsd_desc['devices']:
+            if 'fields' not in device:
+                raise Exception("vmstate for device %s has no fields" % device['name'])
             key = (device['name'], device['instance_id'])
             value = ( VMSDSection, device )
             self.section_classes[key] = value
@@ -666,31 +678,34 @@ def default(self, o):
 
 jsonenc = JSONEncoder(indent=4, separators=(',', ': '))
 
-if args.extract:
-    dump = MigrationDump(args.file)
-
-    dump.read(desc_only = True)
-    print("desc.json")
-    f = open("desc.json", "w")
-    f.truncate()
-    f.write(jsonenc.encode(dump.vmsd_desc))
-    f.close()
-
-    dump.read(write_memory = True)
-    dict = dump.getDict()
-    print("state.json")
-    f = open("state.json", "w")
-    f.truncate()
-    f.write(jsonenc.encode(dict))
-    f.close()
-elif args.dump == "state":
-    dump = MigrationDump(args.file)
-    dump.read(dump_memory = args.memory)
-    dict = dump.getDict()
-    print(jsonenc.encode(dict))
-elif args.dump == "desc":
-    dump = MigrationDump(args.file)
-    dump.read(desc_only = True)
-    print(jsonenc.encode(dump.vmsd_desc))
-else:
+if not any([args.extract, args.dump == "state", args.dump == "desc"]):
     raise Exception("Please specify either -x, -d state or -d desc")
+
+try:
+    dump = MigrationDump(args.file)
+
+    if args.extract:
+        dump.read(desc_only = True)
+
+        print("desc.json")
+        f = open("desc.json", "w")
+        f.truncate()
+        f.write(jsonenc.encode(dump.vmsd_desc))
+        f.close()
+
+        dump.read(write_memory = True)
+        dict = dump.getDict()
+        print("state.json")
+        f = open("state.json", "w")
+        f.truncate()
+        f.write(jsonenc.encode(dict))
+        f.close()
+    elif args.dump == "state":
+        dump.read(dump_memory = args.memory)
+        dict = dump.getDict()
+        print(jsonenc.encode(dict))
+    elif args.dump == "desc":
+        dump.read(desc_only = True)
+        print(jsonenc.encode(dump.vmsd_desc))
+except Exception:
+    raise Exception("Full JSON dump:\n%s", dump.vmsd_json)
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 2/7] migration: Remove unused argument in vmsd_desc_field_end
  2025-01-09 18:52 [PATCH v3 0/7] migration: Fix s390 regressions + migration script Fabiano Rosas
  2025-01-09 18:52 ` [PATCH v3 1/7] migration: Add more error handling to analyze-migration.py Fabiano Rosas
@ 2025-01-09 18:52 ` Fabiano Rosas
  2025-01-09 18:52 ` [PATCH v3 3/7] migration: Fix parsing of s390 stream Fabiano Rosas
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 22+ messages in thread
From: Fabiano Rosas @ 2025-01-09 18:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Thomas Huth

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/vmstate.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/migration/vmstate.c b/migration/vmstate.c
index fa002b24e8..aa2821dec6 100644
--- a/migration/vmstate.c
+++ b/migration/vmstate.c
@@ -311,7 +311,7 @@ static void vmsd_desc_field_start(const VMStateDescription *vmsd,
 
 static void vmsd_desc_field_end(const VMStateDescription *vmsd,
                                 JSONWriter *vmdesc,
-                                const VMStateField *field, size_t size, int i)
+                                const VMStateField *field, size_t size)
 {
     if (!vmdesc) {
         return;
@@ -420,7 +420,7 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd,
                 }
 
                 written_bytes = qemu_file_transferred(f) - old_offset;
-                vmsd_desc_field_end(vmsd, vmdesc_loop, field, written_bytes, i);
+                vmsd_desc_field_end(vmsd, vmdesc_loop, field, written_bytes);
 
                 /* Compressed arrays only care about the first element */
                 if (vmdesc_loop && vmsd_can_compress(field)) {
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 3/7] migration: Fix parsing of s390 stream
  2025-01-09 18:52 [PATCH v3 0/7] migration: Fix s390 regressions + migration script Fabiano Rosas
  2025-01-09 18:52 ` [PATCH v3 1/7] migration: Add more error handling to analyze-migration.py Fabiano Rosas
  2025-01-09 18:52 ` [PATCH v3 2/7] migration: Remove unused argument in vmsd_desc_field_end Fabiano Rosas
@ 2025-01-09 18:52 ` Fabiano Rosas
  2025-01-12 13:06   ` Michael Tokarev
  2025-01-09 18:52 ` [PATCH v3 4/7] migration: Rename vmstate_info_nullptr Fabiano Rosas
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Fabiano Rosas @ 2025-01-09 18:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Thomas Huth

The parsing for the S390StorageAttributes section is currently leaving
an unconsumed token that is later interpreted by the generic code as
QEMU_VM_EOF, cutting the parsing short.

The migration will issue a STATTR_FLAG_DONE between iterations, which
the script consumes correctly, but there's a final STATTR_FLAG_EOS at
.save_complete that the script is ignoring. Since the EOS flag is a
u64 0x1ULL and the stream is big endian, on little endian hosts a byte
read from it will be 0x0, the same as QEMU_VM_EOF.

Fixes: 81c2c9dd5d ("tests/qtest/migration-test: Fix analyze-migration.py for s390x")
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 scripts/analyze-migration.py | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py
index f2457b1dde..fcda11f31d 100755
--- a/scripts/analyze-migration.py
+++ b/scripts/analyze-migration.py
@@ -65,6 +65,9 @@ def readvar(self, size = None):
     def tell(self):
         return self.file.tell()
 
+    def seek(self, a, b):
+        return self.file.seek(a, b)
+
     # The VMSD description is at the end of the file, after EOF. Look for
     # the last NULL byte, then for the beginning brace of JSON.
     def read_migration_debug_json(self):
@@ -272,11 +275,24 @@ def __init__(self, file, version_id, device, section_key):
         self.section_key = section_key
 
     def read(self):
+        pos = 0
         while True:
             addr_flags = self.file.read64()
             flags = addr_flags & 0xfff
-            if (flags & (self.STATTR_FLAG_DONE | self.STATTR_FLAG_EOS)):
+
+            if flags & self.STATTR_FLAG_DONE:
+                pos = self.file.tell()
+                continue
+            elif flags & self.STATTR_FLAG_EOS:
                 return
+            else:
+                # No EOS came after DONE, that's OK, but rewind the
+                # stream because this is not our data.
+                if pos:
+                    self.file.seek(pos, os.SEEK_SET)
+                    return
+                raise Exception("Unknown flags %x", flags)
+
             if (flags & self.STATTR_FLAG_ERROR):
                 raise Exception("Error in migration stream")
             count = self.file.read64()
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 4/7] migration: Rename vmstate_info_nullptr
  2025-01-09 18:52 [PATCH v3 0/7] migration: Fix s390 regressions + migration script Fabiano Rosas
                   ` (2 preceding siblings ...)
  2025-01-09 18:52 ` [PATCH v3 3/7] migration: Fix parsing of s390 stream Fabiano Rosas
@ 2025-01-09 18:52 ` Fabiano Rosas
  2025-01-09 19:10   ` Peter Xu
  2025-01-09 18:52 ` [PATCH v3 5/7] migration: Dump correct JSON format for nullptr replacement Fabiano Rosas
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Fabiano Rosas @ 2025-01-09 18:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Thomas Huth

Rename vmstate_info_nullptr from "uint64_t" to "nullptr". This vmstate
actually reads and writes just a byte, so the proper name would be
uint8. However, since this is a marker for a NULL pointer, it's
convenient to have a more explicit name that can be identified by the
consumers of the JSON part of the stream.

Change the name to "nullptr" and add support for it in the
analyze-migration.py script. Arbitrarily use the name of the type as
the value of the field to avoid the script showing 0x30 or '0', which
could be confusing for readers.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/vmstate-types.c    |  2 +-
 scripts/analyze-migration.py | 23 +++++++++++++++++++++++
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/migration/vmstate-types.c b/migration/vmstate-types.c
index e83bfccb9e..d70d573dbd 100644
--- a/migration/vmstate-types.c
+++ b/migration/vmstate-types.c
@@ -338,7 +338,7 @@ static int put_nullptr(QEMUFile *f, void *pv, size_t size,
 }
 
 const VMStateInfo vmstate_info_nullptr = {
-    .name = "uint64",
+    .name = "nullptr",
     .get  = get_nullptr,
     .put  = put_nullptr,
 };
diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py
index fcda11f31d..923f174f1b 100755
--- a/scripts/analyze-migration.py
+++ b/scripts/analyze-migration.py
@@ -417,6 +417,28 @@ def __init__(self, desc, file):
         super(VMSDFieldIntLE, self).__init__(desc, file)
         self.dtype = '<i%d' % self.size
 
+class VMSDFieldNull(VMSDFieldGeneric):
+    NULL_PTR_MARKER = b'0'
+
+    def __init__(self, desc, file):
+        super(VMSDFieldNull, self).__init__(desc, file)
+
+    def __repr__(self):
+        # A NULL pointer is encoded in the stream as a '0' to
+        # disambiguate from a mere 0x0 value and avoid consumers
+        # trying to follow the NULL pointer. Displaying '0', 0x30 or
+        # 0x0 when analyzing the JSON debug stream could become
+        # confusing, so use an explicit term instead.
+        return "nullptr"
+
+    def __str__(self):
+        return self.__repr__()
+
+    def read(self):
+        super(VMSDFieldNull, self).read()
+        assert(self.data == self.NULL_PTR_MARKER)
+        return self.data
+
 class VMSDFieldBool(VMSDFieldGeneric):
     def __init__(self, desc, file):
         super(VMSDFieldBool, self).__init__(desc, file)
@@ -558,6 +580,7 @@ def getDict(self):
     "bitmap" : VMSDFieldGeneric,
     "struct" : VMSDFieldStruct,
     "capability": VMSDFieldCap,
+    "nullptr": VMSDFieldNull,
     "unknown" : VMSDFieldGeneric,
 }
 
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 5/7] migration: Dump correct JSON format for nullptr replacement
  2025-01-09 18:52 [PATCH v3 0/7] migration: Fix s390 regressions + migration script Fabiano Rosas
                   ` (3 preceding siblings ...)
  2025-01-09 18:52 ` [PATCH v3 4/7] migration: Rename vmstate_info_nullptr Fabiano Rosas
@ 2025-01-09 18:52 ` Fabiano Rosas
  2025-01-09 18:52 ` [PATCH v3 6/7] migration: Fix arrays of pointers in JSON writer Fabiano Rosas
  2025-01-09 18:52 ` [PATCH v3 7/7] s390x: Fix CSS migration Fabiano Rosas
  6 siblings, 0 replies; 22+ messages in thread
From: Fabiano Rosas @ 2025-01-09 18:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Thomas Huth

From: Peter Xu <peterx@redhat.com>

QEMU plays a trick with null pointers inside an array of pointers in a VMSD
field.  See 07d4e69147 ("migration/vmstate: fix array of ptr with
nullptrs") for more details on why.  The idea makes sense in general, but
it may overlooked the JSON writer where it could write nothing in a
"struct" in the JSON hints section.

We hit some analyze-migration.py issues on s390 recently, showing that some
of the struct field contains nothing, like:

{"name": "css", "array_len": 256, "type": "struct", "struct": {}, "size": 1}

As described in details by Fabiano:

https://lore.kernel.org/r/87pll37cin.fsf@suse.de

It could be that we hit some null pointers there, and JSON was gone when
they're null pointers.

To fix it, instead of hacking around only at VMStateInfo level, do that
from VMStateField level, so that JSON writer can also be involved.  In this
case, JSON writer will replace the pointer array (which used to be a
"struct") to be the real representation of the nullptr field.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/vmstate.c | 118 ++++++++++++++++++++++++++++++++++----------
 1 file changed, 91 insertions(+), 27 deletions(-)

diff --git a/migration/vmstate.c b/migration/vmstate.c
index aa2821dec6..52704c822c 100644
--- a/migration/vmstate.c
+++ b/migration/vmstate.c
@@ -51,6 +51,36 @@ vmstate_field_exists(const VMStateDescription *vmsd, const VMStateField *field,
     return result;
 }
 
+/*
+ * Create a fake nullptr field when there's a NULL pointer detected in the
+ * array of a VMS_ARRAY_OF_POINTER VMSD field.  It's needed because we
+ * can't dereference the NULL pointer.
+ */
+static const VMStateField *
+vmsd_create_fake_nullptr_field(const VMStateField *field)
+{
+    VMStateField *fake = g_new0(VMStateField, 1);
+
+    /* It can only happen on an array of pointers! */
+    assert(field->flags & VMS_ARRAY_OF_POINTER);
+
+    /* Some of fake's properties should match the original's */
+    fake->name = field->name;
+    fake->version_id = field->version_id;
+
+    /* Do not need "field_exists" check as it always exists (which is null) */
+    fake->field_exists = NULL;
+
+    /* See vmstate_info_nullptr - use 1 byte to represent nullptr */
+    fake->size = 1;
+    fake->info = &vmstate_info_nullptr;
+    fake->flags = VMS_SINGLE;
+
+    /* All the rest fields shouldn't matter.. */
+
+    return (const VMStateField *)fake;
+}
+
 static int vmstate_n_elems(void *opaque, const VMStateField *field)
 {
     int n_elems = 1;
@@ -143,23 +173,39 @@ int vmstate_load_state(QEMUFile *f, const VMStateDescription *vmsd,
             }
             for (i = 0; i < n_elems; i++) {
                 void *curr_elem = first_elem + size * i;
+                const VMStateField *inner_field;
 
                 if (field->flags & VMS_ARRAY_OF_POINTER) {
                     curr_elem = *(void **)curr_elem;
                 }
+
                 if (!curr_elem && size) {
-                    /* if null pointer check placeholder and do not follow */
-                    assert(field->flags & VMS_ARRAY_OF_POINTER);
-                    ret = vmstate_info_nullptr.get(f, curr_elem, size, NULL);
-                } else if (field->flags & VMS_STRUCT) {
-                    ret = vmstate_load_state(f, field->vmsd, curr_elem,
-                                             field->vmsd->version_id);
-                } else if (field->flags & VMS_VSTRUCT) {
-                    ret = vmstate_load_state(f, field->vmsd, curr_elem,
-                                             field->struct_version_id);
+                    /*
+                     * If null pointer found (which should only happen in
+                     * an array of pointers), use null placeholder and do
+                     * not follow.
+                     */
+                    inner_field = vmsd_create_fake_nullptr_field(field);
                 } else {
-                    ret = field->info->get(f, curr_elem, size, field);
+                    inner_field = field;
                 }
+
+                if (inner_field->flags & VMS_STRUCT) {
+                    ret = vmstate_load_state(f, inner_field->vmsd, curr_elem,
+                                             inner_field->vmsd->version_id);
+                } else if (inner_field->flags & VMS_VSTRUCT) {
+                    ret = vmstate_load_state(f, inner_field->vmsd, curr_elem,
+                                             inner_field->struct_version_id);
+                } else {
+                    ret = inner_field->info->get(f, curr_elem, size,
+                                                 inner_field);
+                }
+
+                /* If we used a fake temp field.. free it now */
+                if (inner_field != field) {
+                    g_clear_pointer((gpointer *)&inner_field, g_free);
+                }
+
                 if (ret >= 0) {
                     ret = qemu_file_get_error(f);
                 }
@@ -387,29 +433,50 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd,
             }
             for (i = 0; i < n_elems; i++) {
                 void *curr_elem = first_elem + size * i;
+                const VMStateField *inner_field;
 
-                vmsd_desc_field_start(vmsd, vmdesc_loop, field, i, n_elems);
                 old_offset = qemu_file_transferred(f);
                 if (field->flags & VMS_ARRAY_OF_POINTER) {
                     assert(curr_elem);
                     curr_elem = *(void **)curr_elem;
                 }
+
                 if (!curr_elem && size) {
-                    /* if null pointer write placeholder and do not follow */
-                    assert(field->flags & VMS_ARRAY_OF_POINTER);
-                    ret = vmstate_info_nullptr.put(f, curr_elem, size, NULL,
-                                                   NULL);
-                } else if (field->flags & VMS_STRUCT) {
-                    ret = vmstate_save_state(f, field->vmsd, curr_elem,
-                                             vmdesc_loop);
-                } else if (field->flags & VMS_VSTRUCT) {
-                    ret = vmstate_save_state_v(f, field->vmsd, curr_elem,
-                                               vmdesc_loop,
-                                               field->struct_version_id, errp);
+                    /*
+                     * If null pointer found (which should only happen in
+                     * an array of pointers), use null placeholder and do
+                     * not follow.
+                     */
+                    inner_field = vmsd_create_fake_nullptr_field(field);
                 } else {
-                    ret = field->info->put(f, curr_elem, size, field,
-                                     vmdesc_loop);
+                    inner_field = field;
                 }
+
+                vmsd_desc_field_start(vmsd, vmdesc_loop, inner_field,
+                                      i, n_elems);
+
+                if (inner_field->flags & VMS_STRUCT) {
+                    ret = vmstate_save_state(f, inner_field->vmsd,
+                                             curr_elem, vmdesc_loop);
+                } else if (inner_field->flags & VMS_VSTRUCT) {
+                    ret = vmstate_save_state_v(f, inner_field->vmsd,
+                                               curr_elem, vmdesc_loop,
+                                               inner_field->struct_version_id,
+                                               errp);
+                } else {
+                    ret = inner_field->info->put(f, curr_elem, size,
+                                                 inner_field, vmdesc_loop);
+                }
+
+                written_bytes = qemu_file_transferred(f) - old_offset;
+                vmsd_desc_field_end(vmsd, vmdesc_loop, inner_field,
+                                    written_bytes);
+
+                /* If we used a fake temp field.. free it now */
+                if (inner_field != field) {
+                    g_clear_pointer((gpointer *)&inner_field, g_free);
+                }
+
                 if (ret) {
                     error_setg(errp, "Save of field %s/%s failed",
                                 vmsd->name, field->name);
@@ -419,9 +486,6 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd,
                     return ret;
                 }
 
-                written_bytes = qemu_file_transferred(f) - old_offset;
-                vmsd_desc_field_end(vmsd, vmdesc_loop, field, written_bytes);
-
                 /* Compressed arrays only care about the first element */
                 if (vmdesc_loop && vmsd_can_compress(field)) {
                     vmdesc_loop = NULL;
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 6/7] migration: Fix arrays of pointers in JSON writer
  2025-01-09 18:52 [PATCH v3 0/7] migration: Fix s390 regressions + migration script Fabiano Rosas
                   ` (4 preceding siblings ...)
  2025-01-09 18:52 ` [PATCH v3 5/7] migration: Dump correct JSON format for nullptr replacement Fabiano Rosas
@ 2025-01-09 18:52 ` Fabiano Rosas
  2025-01-09 19:10   ` Peter Xu
  2025-01-09 18:52 ` [PATCH v3 7/7] s390x: Fix CSS migration Fabiano Rosas
  6 siblings, 1 reply; 22+ messages in thread
From: Fabiano Rosas @ 2025-01-09 18:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Thomas Huth

Currently, if an array of pointers contains a NULL pointer, that
pointer will be encoded as '0' in the stream. Since the JSON writer
doesn't define a "pointer" type, that '0' will now be an uint8, which
is different from the original type being pointed to, e.g. struct.

(we're further calling uint8 "nullptr", but that's irrelevant to the
issue)

That mixed-type array shouldn't be compressed, otherwise data is lost
as the code currently makes the whole array have the type of the first
element:

css = {NULL, NULL, ..., 0x5555568a7940, NULL};

{"name": "s390_css", "instance_id": 0, "vmsd_name": "s390_css",
 "version": 1, "fields": [
    ...,
    {"name": "css", "array_len": 256, "type": "nullptr", "size": 1},
    ...,
]}

In the above, the valid pointer at position 254 got lost among the
compressed array of nullptr.

While we could disable the array compression when a NULL pointer is
found, the JSON part of the stream still makes part of downtime, so we
should avoid writing unecessary bytes to it.

Keep the array compression in place, but if NULL and non-NULL pointers
are mixed break the array into several type-contiguous pieces :

css = {NULL, NULL, ..., 0x5555568a7940, NULL};

{"name": "s390_css", "instance_id": 0, "vmsd_name": "s390_css",
 "version": 1, "fields": [
     ...,
     {"name": "css", "array_len": 254, "type": "nullptr", "size": 1},
     {"name": "css", "type": "struct", "struct": {"vmsd_name": "s390_css_img", ... }, "size": 768},
     {"name": "css", "type": "nullptr", "size": 1},
     ...,
]}

Now each type-discontiguous region will become a new JSON entry. The
reader should interpret this as a concatenation of values, all part of
the same field.

Parsing the JSON with analyze-script.py now shows the proper data
being pointed to at the places where the pointer is valid and
"nullptr" where there's NULL:

"s390_css (14)": {
    ...
    "css": [
        "nullptr",
        "nullptr",
        ...
        "nullptr",
        {
            "chpids": [
            {
                "in_use": "0x00",
                "type": "0x00",
                "is_virtual": "0x00"
            },
            ...
            ]
        },
        "nullptr",
    }

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/vmstate.c          | 33 ++++++++++++++++++++++++++++++++-
 scripts/analyze-migration.py | 26 ++++++++++++++++++--------
 2 files changed, 50 insertions(+), 9 deletions(-)

diff --git a/migration/vmstate.c b/migration/vmstate.c
index 52704c822c..82bd005a83 100644
--- a/migration/vmstate.c
+++ b/migration/vmstate.c
@@ -425,15 +425,19 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd,
             int size = vmstate_size(opaque, field);
             uint64_t old_offset, written_bytes;
             JSONWriter *vmdesc_loop = vmdesc;
+            bool is_prev_null = false;
 
             trace_vmstate_save_state_loop(vmsd->name, field->name, n_elems);
             if (field->flags & VMS_POINTER) {
                 first_elem = *(void **)first_elem;
                 assert(first_elem || !n_elems || !size);
             }
+
             for (i = 0; i < n_elems; i++) {
                 void *curr_elem = first_elem + size * i;
                 const VMStateField *inner_field;
+                bool is_null;
+                int max_elems = n_elems - i;
 
                 old_offset = qemu_file_transferred(f);
                 if (field->flags & VMS_ARRAY_OF_POINTER) {
@@ -448,12 +452,39 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd,
                      * not follow.
                      */
                     inner_field = vmsd_create_fake_nullptr_field(field);
+                    is_null = true;
                 } else {
                     inner_field = field;
+                    is_null = false;
+                }
+
+                /*
+                 * Due to the fake nullptr handling above, if there's mixed
+                 * null/non-null data, it doesn't make sense to emit a
+                 * compressed array representation spanning the entire array
+                 * because the field types will be different (e.g. struct
+                 * vs. nullptr). Search ahead for the next null/non-null element
+                 * and start a new compressed array if found.
+                 */
+                if (field->flags & VMS_ARRAY_OF_POINTER &&
+                    is_null != is_prev_null) {
+
+                    is_prev_null = is_null;
+                    vmdesc_loop = vmdesc;
+
+                    for (int j = i + 1; j < n_elems; j++) {
+                        void *elem = *(void **)(first_elem + size * j);
+                        bool elem_is_null = !elem && size;
+
+                        if (is_null != elem_is_null) {
+                            max_elems = j - i;
+                            break;
+                        }
+                    }
                 }
 
                 vmsd_desc_field_start(vmsd, vmdesc_loop, inner_field,
-                                      i, n_elems);
+                                      i, max_elems);
 
                 if (inner_field->flags & VMS_STRUCT) {
                     ret = vmstate_save_state(f, inner_field->vmsd,
diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py
index 923f174f1b..8e1fbf4c9d 100755
--- a/scripts/analyze-migration.py
+++ b/scripts/analyze-migration.py
@@ -502,15 +502,25 @@ def read(self):
             field['data'] = reader(field, self.file)
             field['data'].read()
 
-            if 'index' in field:
-                if field['name'] not in self.data:
-                    self.data[field['name']] = []
-                a = self.data[field['name']]
-                if len(a) != int(field['index']):
-                    raise Exception("internal index of data field unmatched (%d/%d)" % (len(a), int(field['index'])))
-                a.append(field['data'])
+            fname = field['name']
+            fdata = field['data']
+
+            # The field could be:
+            # i) a single data entry, e.g. uint64
+            # ii) an array, indicated by it containing the 'index' key
+            #
+            # However, the overall data after parsing the whole
+            # stream, could be a mix of arrays and single data fields,
+            # all sharing the same field name due to how QEMU breaks
+            # up arrays with NULL pointers into multiple compressed
+            # array segments.
+            if fname not in self.data:
+                self.data[fname] = fdata
+            elif type(self.data[fname]) == list:
+                self.data[fname].append(fdata)
             else:
-                self.data[field['name']] = field['data']
+                tmp = self.data[fname]
+                self.data[fname] = [tmp, fdata]
 
         if 'subsections' in self.desc['struct']:
             for subsection in self.desc['struct']['subsections']:
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 7/7] s390x: Fix CSS migration
  2025-01-09 18:52 [PATCH v3 0/7] migration: Fix s390 regressions + migration script Fabiano Rosas
                   ` (5 preceding siblings ...)
  2025-01-09 18:52 ` [PATCH v3 6/7] migration: Fix arrays of pointers in JSON writer Fabiano Rosas
@ 2025-01-09 18:52 ` Fabiano Rosas
  2025-01-12 14:34   ` Michael Tokarev
  2025-01-12 14:34   ` Michael Tokarev
  6 siblings, 2 replies; 22+ messages in thread
From: Fabiano Rosas @ 2025-01-09 18:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Thomas Huth, Paolo Bonzini, qemu-stable

Commit a55ae46683 ("s390: move css_migration_enabled from machine to
css.c") disabled CSS migration globally instead of doing it
per-instance.

CC: Paolo Bonzini <pbonzini@redhat.com>
CC: qemu-stable@nongnu.org #9.1
Fixes: a55ae46683 ("s390: move css_migration_enabled from machine to css.c")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2704
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241213160120.23880-3-farosas@suse.de>
---
 hw/s390x/s390-virtio-ccw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 8a242cc1ec..38aeba14ee 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -1244,6 +1244,7 @@ static void ccw_machine_2_9_instance_options(MachineState *machine)
     s390_cpudef_featoff_greater(12, 1, S390_FEAT_ZPCI);
     s390_cpudef_featoff_greater(12, 1, S390_FEAT_ADAPTER_INT_SUPPRESSION);
     s390_cpudef_featoff_greater(12, 1, S390_FEAT_ADAPTER_EVENT_NOTIFICATION);
+    css_migration_enabled = false;
 }
 
 static void ccw_machine_2_9_class_options(MachineClass *mc)
@@ -1256,7 +1257,6 @@ static void ccw_machine_2_9_class_options(MachineClass *mc)
     ccw_machine_2_10_class_options(mc);
     compat_props_add(mc->compat_props, hw_compat_2_9, hw_compat_2_9_len);
     compat_props_add(mc->compat_props, compat, G_N_ELEMENTS(compat));
-    css_migration_enabled = false;
 }
 DEFINE_CCW_MACHINE(2, 9);
 
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 4/7] migration: Rename vmstate_info_nullptr
  2025-01-09 18:52 ` [PATCH v3 4/7] migration: Rename vmstate_info_nullptr Fabiano Rosas
@ 2025-01-09 19:10   ` Peter Xu
  0 siblings, 0 replies; 22+ messages in thread
From: Peter Xu @ 2025-01-09 19:10 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, Thomas Huth

On Thu, Jan 09, 2025 at 03:52:46PM -0300, Fabiano Rosas wrote:
> Rename vmstate_info_nullptr from "uint64_t" to "nullptr". This vmstate
> actually reads and writes just a byte, so the proper name would be
> uint8. However, since this is a marker for a NULL pointer, it's
> convenient to have a more explicit name that can be identified by the
> consumers of the JSON part of the stream.
> 
> Change the name to "nullptr" and add support for it in the
> analyze-migration.py script. Arbitrarily use the name of the type as
> the value of the field to avoid the script showing 0x30 or '0', which
> could be confusing for readers.
> 
> Signed-off-by: Fabiano Rosas <farosas@suse.de>

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 6/7] migration: Fix arrays of pointers in JSON writer
  2025-01-09 18:52 ` [PATCH v3 6/7] migration: Fix arrays of pointers in JSON writer Fabiano Rosas
@ 2025-01-09 19:10   ` Peter Xu
  0 siblings, 0 replies; 22+ messages in thread
From: Peter Xu @ 2025-01-09 19:10 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, Thomas Huth

On Thu, Jan 09, 2025 at 03:52:48PM -0300, Fabiano Rosas wrote:
> Currently, if an array of pointers contains a NULL pointer, that
> pointer will be encoded as '0' in the stream. Since the JSON writer
> doesn't define a "pointer" type, that '0' will now be an uint8, which
> is different from the original type being pointed to, e.g. struct.
> 
> (we're further calling uint8 "nullptr", but that's irrelevant to the
> issue)
> 
> That mixed-type array shouldn't be compressed, otherwise data is lost
> as the code currently makes the whole array have the type of the first
> element:
> 
> css = {NULL, NULL, ..., 0x5555568a7940, NULL};
> 
> {"name": "s390_css", "instance_id": 0, "vmsd_name": "s390_css",
>  "version": 1, "fields": [
>     ...,
>     {"name": "css", "array_len": 256, "type": "nullptr", "size": 1},
>     ...,
> ]}
> 
> In the above, the valid pointer at position 254 got lost among the
> compressed array of nullptr.
> 
> While we could disable the array compression when a NULL pointer is
> found, the JSON part of the stream still makes part of downtime, so we
> should avoid writing unecessary bytes to it.
> 
> Keep the array compression in place, but if NULL and non-NULL pointers
> are mixed break the array into several type-contiguous pieces :
> 
> css = {NULL, NULL, ..., 0x5555568a7940, NULL};
> 
> {"name": "s390_css", "instance_id": 0, "vmsd_name": "s390_css",
>  "version": 1, "fields": [
>      ...,
>      {"name": "css", "array_len": 254, "type": "nullptr", "size": 1},
>      {"name": "css", "type": "struct", "struct": {"vmsd_name": "s390_css_img", ... }, "size": 768},
>      {"name": "css", "type": "nullptr", "size": 1},
>      ...,
> ]}
> 
> Now each type-discontiguous region will become a new JSON entry. The
> reader should interpret this as a concatenation of values, all part of
> the same field.
> 
> Parsing the JSON with analyze-script.py now shows the proper data
> being pointed to at the places where the pointer is valid and
> "nullptr" where there's NULL:
> 
> "s390_css (14)": {
>     ...
>     "css": [
>         "nullptr",
>         "nullptr",
>         ...
>         "nullptr",
>         {
>             "chpids": [
>             {
>                 "in_use": "0x00",
>                 "type": "0x00",
>                 "is_virtual": "0x00"
>             },
>             ...
>             ]
>         },
>         "nullptr",
>     }
> 
> Signed-off-by: Fabiano Rosas <farosas@suse.de>

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 3/7] migration: Fix parsing of s390 stream
  2025-01-09 18:52 ` [PATCH v3 3/7] migration: Fix parsing of s390 stream Fabiano Rosas
@ 2025-01-12 13:06   ` Michael Tokarev
  2025-01-12 14:29     ` Michael Tokarev
  0 siblings, 1 reply; 22+ messages in thread
From: Michael Tokarev @ 2025-01-12 13:06 UTC (permalink / raw)
  To: Fabiano Rosas, qemu-devel; +Cc: Peter Xu, Thomas Huth, qemu-stable

09.01.2025 21:52, Fabiano Rosas wrote:
> The parsing for the S390StorageAttributes section is currently leaving
> an unconsumed token that is later interpreted by the generic code as
> QEMU_VM_EOF, cutting the parsing short.
> 
> The migration will issue a STATTR_FLAG_DONE between iterations, which
> the script consumes correctly, but there's a final STATTR_FLAG_EOS at
> .save_complete that the script is ignoring. Since the EOS flag is a
> u64 0x1ULL and the stream is big endian, on little endian hosts a byte
> read from it will be 0x0, the same as QEMU_VM_EOF.
> 
> Fixes: 81c2c9dd5d ("tests/qtest/migration-test: Fix analyze-migration.py for s390x")
> Reviewed-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>

This looks like a qemu-stable material (if not only for tests), is it not?

Thanks,

/mjt


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 3/7] migration: Fix parsing of s390 stream
  2025-01-12 13:06   ` Michael Tokarev
@ 2025-01-12 14:29     ` Michael Tokarev
  2025-01-13  6:39       ` Thomas Huth
  0 siblings, 1 reply; 22+ messages in thread
From: Michael Tokarev @ 2025-01-12 14:29 UTC (permalink / raw)
  To: Fabiano Rosas, qemu-devel; +Cc: Peter Xu, Thomas Huth, qemu-stable

12.01.2025 16:06, Michael Tokarev wrote:
> 09.01.2025 21:52, Fabiano Rosas wrote:
>> The parsing for the S390StorageAttributes section is currently leaving
>> an unconsumed token that is later interpreted by the generic code as
>> QEMU_VM_EOF, cutting the parsing short.
>>
>> The migration will issue a STATTR_FLAG_DONE between iterations, which
>> the script consumes correctly, but there's a final STATTR_FLAG_EOS at
>> .save_complete that the script is ignoring. Since the EOS flag is a
>> u64 0x1ULL and the stream is big endian, on little endian hosts a byte
>> read from it will be 0x0, the same as QEMU_VM_EOF.
>>
>> Fixes: 81c2c9dd5d ("tests/qtest/migration-test: Fix analyze-migration.py for s390x")
>> Reviewed-by: Peter Xu <peterx@redhat.com>
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> 
> This looks like a qemu-stable material (if not only for tests), is it not?

This one, when applied to 9.2 together with "s390x: Fix CSS migration",
causes s390x-migration-test failure.

First, it goes:

# starting QEMU: exec ./qemu-system-s390x -qtest unix:/tmp/qtest-1137270.sock -qtest-log /dev/null -chardev 
socket,path=/tmp/qtest-1137270.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -accel kvm -accel tcg -machine 
s390-ccw-virtio-9.2, -name target,debug-threads=on -m 128M -serial file:/tmp/migration-test-T987Z2/dest_serial -incoming tcp:127.0.0.1:0 -bios 
/tmp/migration-test-T987Z2/bootsect     -accel qtest
Traceback (most recent call last):
   File "/tmp/q/scripts/analyze-migration.py", line 704, in <module>
     dump.read(dump_memory = args.memory)
   File "/tmp/q/scripts/analyze-migration.py", line 641, in read
     section.read()
   File "/tmp/q/scripts/analyze-migration.py", line 477, in read
     field['data'] = reader(field, self.file)
                     ^^^^^^^^^^^^^^^^^^^^^^^^
   File "/tmp/q/scripts/analyze-migration.py", line 450, in __init__
     for field in self.desc['struct']['fields']:
                  ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^
KeyError: 'fields'
# Failed to analyze the migration stream

and finally

**
ERROR:../../build/qemu/9.2/tests/qtest/migration-test.c:4039:main: assertion failed (ret == 0): (1 == 0)
Bail out! ERROR:../../build/qemu/9.2/tests/qtest/migration-test.c:4039:main: assertion failed (ret == 0): (1 == 0)

It doesn't happen when both these patches are applied to 9.1 though -
there, the test succeeds.

Hmm..

Thanks,

/mjt


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 7/7] s390x: Fix CSS migration
  2025-01-09 18:52 ` [PATCH v3 7/7] s390x: Fix CSS migration Fabiano Rosas
@ 2025-01-12 14:34   ` Michael Tokarev
  2025-01-13  6:37     ` Thomas Huth
  2025-01-12 14:34   ` Michael Tokarev
  1 sibling, 1 reply; 22+ messages in thread
From: Michael Tokarev @ 2025-01-12 14:34 UTC (permalink / raw)
  To: Fabiano Rosas, qemu-devel
  Cc: Peter Xu, Thomas Huth, Paolo Bonzini, qemu-stable

09.01.2025 21:52, Fabiano Rosas wrote:
> Commit a55ae46683 ("s390: move css_migration_enabled from machine to
> css.c") disabled CSS migration globally instead of doing it
> per-instance.
> 
> CC: Paolo Bonzini <pbonzini@redhat.com>
> CC: qemu-stable@nongnu.org #9.1
> Fixes: a55ae46683 ("s390: move css_migration_enabled from machine to css.c")
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2704
> Reviewed-by: Thomas Huth <thuth@redhat.com>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> Message-Id: <20241213160120.23880-3-farosas@suse.de>

And this one causes s390x-migration-test failure on 9.2 on s390x.
While this test succeeds on x86_64.

https://gitlab.com/qemu-project/qemu/-/jobs/8829799108

Help? :)

Thanks,

/mjt


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 7/7] s390x: Fix CSS migration
  2025-01-09 18:52 ` [PATCH v3 7/7] s390x: Fix CSS migration Fabiano Rosas
  2025-01-12 14:34   ` Michael Tokarev
@ 2025-01-12 14:34   ` Michael Tokarev
  1 sibling, 0 replies; 22+ messages in thread
From: Michael Tokarev @ 2025-01-12 14:34 UTC (permalink / raw)
  To: Fabiano Rosas, qemu-devel
  Cc: Peter Xu, Thomas Huth, Paolo Bonzini, qemu-stable

09.01.2025 21:52, Fabiano Rosas wrote:
> Commit a55ae46683 ("s390: move css_migration_enabled from machine to
> css.c") disabled CSS migration globally instead of doing it
> per-instance.
> 
> CC: Paolo Bonzini <pbonzini@redhat.com>
> CC: qemu-stable@nongnu.org #9.1

Or should it be applied to 9.1 ONLY, but not to 9.2?

/mjt


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 7/7] s390x: Fix CSS migration
  2025-01-12 14:34   ` Michael Tokarev
@ 2025-01-13  6:37     ` Thomas Huth
  0 siblings, 0 replies; 22+ messages in thread
From: Thomas Huth @ 2025-01-13  6:37 UTC (permalink / raw)
  To: Michael Tokarev, Fabiano Rosas, qemu-devel
  Cc: Peter Xu, Paolo Bonzini, qemu-stable

On 12/01/2025 15.34, Michael Tokarev wrote:
> 09.01.2025 21:52, Fabiano Rosas wrote:
>> Commit a55ae46683 ("s390: move css_migration_enabled from machine to
>> css.c") disabled CSS migration globally instead of doing it
>> per-instance.
>>
>> CC: Paolo Bonzini <pbonzini@redhat.com>
>> CC: qemu-stable@nongnu.org #9.1
>> Fixes: a55ae46683 ("s390: move css_migration_enabled from machine to css.c")
>> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2704
>> Reviewed-by: Thomas Huth <thuth@redhat.com>
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> Message-Id: <20241213160120.23880-3-farosas@suse.de>
> 
> And this one causes s390x-migration-test failure on 9.2 on s390x.
> While this test succeeds on x86_64.
> 
> https://gitlab.com/qemu-project/qemu/-/jobs/8829799108
> 
> Help? :)

That looks like the script error that should be fixed by patch 3/7, I think? 
You likely need that patch, too...

  Thomas



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 3/7] migration: Fix parsing of s390 stream
  2025-01-12 14:29     ` Michael Tokarev
@ 2025-01-13  6:39       ` Thomas Huth
  2025-01-13  7:51         ` Michael Tokarev
  0 siblings, 1 reply; 22+ messages in thread
From: Thomas Huth @ 2025-01-13  6:39 UTC (permalink / raw)
  To: Michael Tokarev, Fabiano Rosas, qemu-devel; +Cc: Peter Xu, qemu-stable

On 12/01/2025 15.29, Michael Tokarev wrote:
> 12.01.2025 16:06, Michael Tokarev wrote:
>> 09.01.2025 21:52, Fabiano Rosas wrote:
>>> The parsing for the S390StorageAttributes section is currently leaving
>>> an unconsumed token that is later interpreted by the generic code as
>>> QEMU_VM_EOF, cutting the parsing short.
>>>
>>> The migration will issue a STATTR_FLAG_DONE between iterations, which
>>> the script consumes correctly, but there's a final STATTR_FLAG_EOS at
>>> .save_complete that the script is ignoring. Since the EOS flag is a
>>> u64 0x1ULL and the stream is big endian, on little endian hosts a byte
>>> read from it will be 0x0, the same as QEMU_VM_EOF.
>>>
>>> Fixes: 81c2c9dd5d ("tests/qtest/migration-test: Fix analyze-migration.py 
>>> for s390x")
>>> Reviewed-by: Peter Xu <peterx@redhat.com>
>>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>>
>> This looks like a qemu-stable material (if not only for tests), is it not?
> 
> This one, when applied to 9.2 together with "s390x: Fix CSS migration",
> causes s390x-migration-test failure.
> 
> First, it goes:
> 
> # starting QEMU: exec ./qemu-system-s390x -qtest unix:/tmp/ 
> qtest-1137270.sock -qtest-log /dev/null -chardev socket,path=/tmp/ 
> qtest-1137270.qmp,id=char0 -mon chardev=char0,mode=control -display none - 
> audio none -accel kvm -accel tcg -machine s390-ccw-virtio-9.2, -name 
> target,debug-threads=on -m 128M -serial file:/tmp/migration-test-T987Z2/ 
> dest_serial -incoming tcp:127.0.0.1:0 -bios /tmp/migration-test-T987Z2/ 
> bootsect     -accel qtest
> Traceback (most recent call last):
>    File "/tmp/q/scripts/analyze-migration.py", line 704, in <module>
>      dump.read(dump_memory = args.memory)
>    File "/tmp/q/scripts/analyze-migration.py", line 641, in read
>      section.read()
>    File "/tmp/q/scripts/analyze-migration.py", line 477, in read
>      field['data'] = reader(field, self.file)
>                      ^^^^^^^^^^^^^^^^^^^^^^^^
>    File "/tmp/q/scripts/analyze-migration.py", line 450, in __init__
>      for field in self.desc['struct']['fields']:
>                   ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^
> KeyError: 'fields'
> # Failed to analyze the migration stream

I think you need to backport patch 1/7, too?

  Thomas



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 3/7] migration: Fix parsing of s390 stream
  2025-01-13  6:39       ` Thomas Huth
@ 2025-01-13  7:51         ` Michael Tokarev
  2025-01-13  8:19           ` Thomas Huth
  0 siblings, 1 reply; 22+ messages in thread
From: Michael Tokarev @ 2025-01-13  7:51 UTC (permalink / raw)
  To: Thomas Huth, Fabiano Rosas, qemu-devel; +Cc: Peter Xu, qemu-stable

13.01.2025 09:39, Thomas Huth wrote:
> On 12/01/2025 15.29, Michael Tokarev wrote:

>> # starting QEMU: exec ./qemu-system-s390x -qtest unix:/tmp/ qtest-1137270.sock -qtest-log /dev/null -chardev socket,path=/tmp/ 
>> qtest-1137270.qmp,id=char0 -mon chardev=char0,mode=control -display none - audio none -accel kvm -accel tcg -machine s390-ccw-virtio-9.2, -name 
>> target,debug-threads=on -m 128M -serial file:/tmp/migration-test-T987Z2/ dest_serial -incoming tcp:127.0.0.1:0 -bios /tmp/migration-test-T987Z2/ 
>> bootsect     -accel qtest
>> Traceback (most recent call last):
>>    File "/tmp/q/scripts/analyze-migration.py", line 704, in <module>
>>      dump.read(dump_memory = args.memory)
>>    File "/tmp/q/scripts/analyze-migration.py", line 641, in read
>>      section.read()
>>    File "/tmp/q/scripts/analyze-migration.py", line 477, in read
>>      field['data'] = reader(field, self.file)
>>                      ^^^^^^^^^^^^^^^^^^^^^^^^
>>    File "/tmp/q/scripts/analyze-migration.py", line 450, in __init__
>>      for field in self.desc['struct']['fields']:
>>                   ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^
>> KeyError: 'fields'
>> # Failed to analyze the migration stream
> 
> I think you need to backport patch 1/7, too?

Picked up:

  1/7 migration: Add more error handling to analyze-migration.py
  3/7 migration: Fix parsing of s390 stream
  7/7 s390x: Fix CSS migration

but still getting the same error:

  https://gitlab.com/mjt0k/qemu/-/jobs/8832218999 (9.2 branch)
  https://gitlab.com/mjt0k/qemu/-/jobs/8832224338 (9.1 branch)

Thanks,

/mjt


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 3/7] migration: Fix parsing of s390 stream
  2025-01-13  7:51         ` Michael Tokarev
@ 2025-01-13  8:19           ` Thomas Huth
  2025-01-13  9:09             ` Michael Tokarev
  0 siblings, 1 reply; 22+ messages in thread
From: Thomas Huth @ 2025-01-13  8:19 UTC (permalink / raw)
  To: Michael Tokarev, Fabiano Rosas, qemu-devel; +Cc: Peter Xu, qemu-stable

On 13/01/2025 08.51, Michael Tokarev wrote:
> 13.01.2025 09:39, Thomas Huth wrote:
>> On 12/01/2025 15.29, Michael Tokarev wrote:
> 
>>> # starting QEMU: exec ./qemu-system-s390x -qtest unix:/tmp/ 
>>> qtest-1137270.sock -qtest-log /dev/null -chardev socket,path=/tmp/ 
>>> qtest-1137270.qmp,id=char0 -mon chardev=char0,mode=control -display none 
>>> - audio none -accel kvm -accel tcg -machine s390-ccw-virtio-9.2, -name 
>>> target,debug-threads=on -m 128M -serial file:/tmp/migration-test-T987Z2/ 
>>> dest_serial -incoming tcp:127.0.0.1:0 -bios /tmp/migration-test-T987Z2/ 
>>> bootsect     -accel qtest
>>> Traceback (most recent call last):
>>>    File "/tmp/q/scripts/analyze-migration.py", line 704, in <module>
>>>      dump.read(dump_memory = args.memory)
>>>    File "/tmp/q/scripts/analyze-migration.py", line 641, in read
>>>      section.read()
>>>    File "/tmp/q/scripts/analyze-migration.py", line 477, in read
>>>      field['data'] = reader(field, self.file)
>>>                      ^^^^^^^^^^^^^^^^^^^^^^^^
>>>    File "/tmp/q/scripts/analyze-migration.py", line 450, in __init__
>>>      for field in self.desc['struct']['fields']:
>>>                   ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^
>>> KeyError: 'fields'
>>> # Failed to analyze the migration stream
>>
>> I think you need to backport patch 1/7, too?
> 
> Picked up:
> 
>   1/7 migration: Add more error handling to analyze-migration.py
>   3/7 migration: Fix parsing of s390 stream
>   7/7 s390x: Fix CSS migration
> 
> but still getting the same error:
> 
>   https://gitlab.com/mjt0k/qemu/-/jobs/8832218999 (9.2 branch)
>   https://gitlab.com/mjt0k/qemu/-/jobs/8832224338 (9.1 branch)

Blindly guessing: You need now patch 4/7 and 5/7, too? ... hopefully Fabiano 
could reply and point you to the exact set of patches that you need...

  Thomas



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 3/7] migration: Fix parsing of s390 stream
  2025-01-13  8:19           ` Thomas Huth
@ 2025-01-13  9:09             ` Michael Tokarev
  2025-01-13 13:03               ` Fabiano Rosas
  0 siblings, 1 reply; 22+ messages in thread
From: Michael Tokarev @ 2025-01-13  9:09 UTC (permalink / raw)
  To: Thomas Huth, Fabiano Rosas, qemu-devel; +Cc: Peter Xu, qemu-stable

13.01.2025 11:19, Thomas Huth wrote:
> On 13/01/2025 08.51, Michael Tokarev wrote:

>> Picked up:
>>
>>   1/7 migration: Add more error handling to analyze-migration.py
>>   3/7 migration: Fix parsing of s390 stream
>>   7/7 s390x: Fix CSS migration
>>
>> but still getting the same error:
>>
>>   https://gitlab.com/mjt0k/qemu/-/jobs/8832218999 (9.2 branch)
>>   https://gitlab.com/mjt0k/qemu/-/jobs/8832224338 (9.1 branch)
> 
> Blindly guessing: You need now patch 4/7 and 5/7, too? ... hopefully Fabiano could reply and point you to the exact set of patches that you need...

Yes, after picking up ALL 7 out of 7 in this patchset (b/c even the
trivial 2/7, "migration: Remove unused argument in vmsd_desc_field_end",
is needed or else the subsequent fixes doesn't apply) it is now green
finally, eg, https://gitlab.com/mjt0k/qemu/-/jobs/8832849813

Fabiano, what do you think, - should the whole patchset be picked up
for 9.2 and 9.1?

Thanks,

/mjt


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 3/7] migration: Fix parsing of s390 stream
  2025-01-13  9:09             ` Michael Tokarev
@ 2025-01-13 13:03               ` Fabiano Rosas
  2025-01-13 13:07                 ` Fabiano Rosas
  2025-01-13 15:13                 ` Michael Tokarev
  0 siblings, 2 replies; 22+ messages in thread
From: Fabiano Rosas @ 2025-01-13 13:03 UTC (permalink / raw)
  To: Michael Tokarev, Thomas Huth, qemu-devel; +Cc: Peter Xu, qemu-stable

Michael Tokarev <mjt@tls.msk.ru> writes:

> 13.01.2025 11:19, Thomas Huth wrote:
>> On 13/01/2025 08.51, Michael Tokarev wrote:
>
>>> Picked up:
>>>
>>>   1/7 migration: Add more error handling to analyze-migration.py
>>>   3/7 migration: Fix parsing of s390 stream
>>>   7/7 s390x: Fix CSS migration
>>>
>>> but still getting the same error:
>>>
>>>   https://gitlab.com/mjt0k/qemu/-/jobs/8832218999 (9.2 branch)
>>>   https://gitlab.com/mjt0k/qemu/-/jobs/8832224338 (9.1 branch)
>> 
>> Blindly guessing: You need now patch 4/7 and 5/7, too? ... hopefully Fabiano could reply and point you to the exact set of patches that you need...
>
> Yes, after picking up ALL 7 out of 7 in this patchset (b/c even the
> trivial 2/7, "migration: Remove unused argument in vmsd_desc_field_end",
> is needed or else the subsequent fixes doesn't apply) it is now green
> finally, eg, https://gitlab.com/mjt0k/qemu/-/jobs/8832849813
>
> Fabiano, what do you think, - should the whole patchset be picked up
> for 9.2 and 9.1?

Yeah, sorry, I was focused on unbreaking the migration PR and added a
bunch of patches without thinking of stable.

So the s390x regression (1/7) is 9.1, but 9.0 already had the broken
analyze-script.py (3/7) and the broken array compression code (4-6/7).

We definitely need 1/7 for 9.1 and 9.2. The rest of the series is "just"
to avoid breaking the tests. If you can apply it easily I think that's
preferable. Otherwise maybe we could disable the analyze-migration.py
test for stable? I can also work on a backport if needed. Let me know
what you prefer.

>
> Thanks,
>
> /mjt


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 3/7] migration: Fix parsing of s390 stream
  2025-01-13 13:03               ` Fabiano Rosas
@ 2025-01-13 13:07                 ` Fabiano Rosas
  2025-01-13 15:13                 ` Michael Tokarev
  1 sibling, 0 replies; 22+ messages in thread
From: Fabiano Rosas @ 2025-01-13 13:07 UTC (permalink / raw)
  To: Michael Tokarev, Thomas Huth, qemu-devel; +Cc: Peter Xu, qemu-stable

Fabiano Rosas <farosas@suse.de> writes:

> Michael Tokarev <mjt@tls.msk.ru> writes:
>
>> 13.01.2025 11:19, Thomas Huth wrote:
>>> On 13/01/2025 08.51, Michael Tokarev wrote:
>>
>>>> Picked up:
>>>>
>>>>   1/7 migration: Add more error handling to analyze-migration.py
>>>>   3/7 migration: Fix parsing of s390 stream
>>>>   7/7 s390x: Fix CSS migration
>>>>
>>>> but still getting the same error:
>>>>
>>>>   https://gitlab.com/mjt0k/qemu/-/jobs/8832218999 (9.2 branch)
>>>>   https://gitlab.com/mjt0k/qemu/-/jobs/8832224338 (9.1 branch)
>>> 
>>> Blindly guessing: You need now patch 4/7 and 5/7, too? ... hopefully Fabiano could reply and point you to the exact set of patches that you need...
>>
>> Yes, after picking up ALL 7 out of 7 in this patchset (b/c even the
>> trivial 2/7, "migration: Remove unused argument in vmsd_desc_field_end",
>> is needed or else the subsequent fixes doesn't apply) it is now green
>> finally, eg, https://gitlab.com/mjt0k/qemu/-/jobs/8832849813
>>
>> Fabiano, what do you think, - should the whole patchset be picked up
>> for 9.2 and 9.1?
>
> Yeah, sorry, I was focused on unbreaking the migration PR and added a
> bunch of patches without thinking of stable.
>
> So the s390x regression (1/7) is 9.1, but 9.0 already had the broken

7/7

> analyze-script.py (3/7) and the broken array compression code (4-6/7).
>
> We definitely need 1/7 for 9.1 and 9.2. The rest of the series is "just"

7/7

> to avoid breaking the tests. If you can apply it easily I think that's
> preferable. Otherwise maybe we could disable the analyze-migration.py
> test for stable? I can also work on a backport if needed. Let me know
> what you prefer.
>
>>
>> Thanks,
>>
>> /mjt


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 3/7] migration: Fix parsing of s390 stream
  2025-01-13 13:03               ` Fabiano Rosas
  2025-01-13 13:07                 ` Fabiano Rosas
@ 2025-01-13 15:13                 ` Michael Tokarev
  1 sibling, 0 replies; 22+ messages in thread
From: Michael Tokarev @ 2025-01-13 15:13 UTC (permalink / raw)
  To: Fabiano Rosas, Thomas Huth, qemu-devel; +Cc: Peter Xu, qemu-stable

13.01.2025 16:03, Fabiano Rosas wrote:
> Michael Tokarev <mjt@tls.msk.ru> writes:
> 
>> 13.01.2025 11:19, Thomas Huth wrote:
>>> On 13/01/2025 08.51, Michael Tokarev wrote:
>>
>>>> Picked up:
>>>>
>>>>    1/7 migration: Add more error handling to analyze-migration.py
>>>>    3/7 migration: Fix parsing of s390 stream
>>>>    7/7 s390x: Fix CSS migration
>>>>
>>>> but still getting the same error:
>>>>
>>>>    https://gitlab.com/mjt0k/qemu/-/jobs/8832218999 (9.2 branch)
>>>>    https://gitlab.com/mjt0k/qemu/-/jobs/8832224338 (9.1 branch)
>>>
>>> Blindly guessing: You need now patch 4/7 and 5/7, too? ... hopefully Fabiano could reply and point you to the exact set of patches that you need...
>>
>> Yes, after picking up ALL 7 out of 7 in this patchset (b/c even the
>> trivial 2/7, "migration: Remove unused argument in vmsd_desc_field_end",
>> is needed or else the subsequent fixes doesn't apply) it is now green
>> finally, eg, https://gitlab.com/mjt0k/qemu/-/jobs/8832849813
>>
>> Fabiano, what do you think, - should the whole patchset be picked up
>> for 9.2 and 9.1?
> 
> Yeah, sorry, I was focused on unbreaking the migration PR and added a
> bunch of patches without thinking of stable.

That's okay, it's already good (and large) enough to think about current
master.

> So the s390x regression (7/7) is 9.1, but 9.0 already had the broken
> analyze-script.py (3/7) and the broken array compression code (4-6/7).
> 
> We definitely need 7/7 for 9.1 and 9.2. The rest of the series is "just"
> to avoid breaking the tests. If you can apply it easily I think that's
> preferable. Otherwise maybe we could disable the analyze-migration.py
> test for stable? I can also work on a backport if needed. Let me know
> what you prefer.

I applied all 7 to both 9.1 and 9.2.  9.0 is end-of-life at this point,
so everything should be ok.  Both 9.1 and 9.2 passes the tests with this
whole series applied.

Thank you!

/mjt

>>
>> Thanks,
>>
>> /mjt



^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2025-01-13 15:14 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-09 18:52 [PATCH v3 0/7] migration: Fix s390 regressions + migration script Fabiano Rosas
2025-01-09 18:52 ` [PATCH v3 1/7] migration: Add more error handling to analyze-migration.py Fabiano Rosas
2025-01-09 18:52 ` [PATCH v3 2/7] migration: Remove unused argument in vmsd_desc_field_end Fabiano Rosas
2025-01-09 18:52 ` [PATCH v3 3/7] migration: Fix parsing of s390 stream Fabiano Rosas
2025-01-12 13:06   ` Michael Tokarev
2025-01-12 14:29     ` Michael Tokarev
2025-01-13  6:39       ` Thomas Huth
2025-01-13  7:51         ` Michael Tokarev
2025-01-13  8:19           ` Thomas Huth
2025-01-13  9:09             ` Michael Tokarev
2025-01-13 13:03               ` Fabiano Rosas
2025-01-13 13:07                 ` Fabiano Rosas
2025-01-13 15:13                 ` Michael Tokarev
2025-01-09 18:52 ` [PATCH v3 4/7] migration: Rename vmstate_info_nullptr Fabiano Rosas
2025-01-09 19:10   ` Peter Xu
2025-01-09 18:52 ` [PATCH v3 5/7] migration: Dump correct JSON format for nullptr replacement Fabiano Rosas
2025-01-09 18:52 ` [PATCH v3 6/7] migration: Fix arrays of pointers in JSON writer Fabiano Rosas
2025-01-09 19:10   ` Peter Xu
2025-01-09 18:52 ` [PATCH v3 7/7] s390x: Fix CSS migration Fabiano Rosas
2025-01-12 14:34   ` Michael Tokarev
2025-01-13  6:37     ` Thomas Huth
2025-01-12 14:34   ` Michael Tokarev

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.