public inbox for qemu-devel@nongnu.org
* [PATCH v3 0/7] Improve traces on migration failure due to cpreg number mismatch
@ 2026-03-04 10:15 Eric Auger
  2026-03-04 10:15 ` [PATCH v3 1/7] vmstate: Introduce VMSTATE_VARRAY_INT32_ALLOC Eric Auger
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: Eric Auger @ 2026-03-04 10:15 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	cohuck, sebott, peterx, philmd, alex.bennee

This series is a follow-up to the discussions held in
[PATCH v6 00/11] Mitigation of "failed to load
cpu:cpreg_vmstate_array_len" migration failures
(https://lore.kernel.org/all/20260126165445.3033335-1-eric.auger@redhat.com/)

It only covers the improvement of the traces. Actual mitigations
are handled in a follow-up series:
Mitigation of "failed to load cpu:cpreg_vmstate_array_len" (v8)

When the number of CPU registers received in an incoming migration
stream is larger than the number of CPU registers seen by the
destination, we currently get the following error:

"failed to load cpu:cpreg_vmstate_array_len"

This series removes this cryptic message and explicitly outputs
which spurious registers cause the failures.

---

Available at:
https://github.com/eauger/qemu/tree/cpreg_vmstate_array_len_traces_v3_mitig_v2

v2 -> v3:
- clear the vmstate array pointers on post_save()
- improve the print_register_name comment
- collect remaining R-b's

v1 -> v2:
- Took all comments from Peter on v1 into account. See individual history logs
- Also added last patch which was previously included in the follow-up
  mitigation series ([PATCH v7 0/7] Mitigation of "failed to load
  cpu:cpreg_vmstate_array_len" migration failures) but actually belongs
  to those functional changes.


Eric Auger (7):
  vmstate: Introduce VMSTATE_VARRAY_INT32_ALLOC
  target/arm/machine: Use VMSTATE_VARRAY_INT32_ALLOC for cpreg arrays
  target/arm/kvm: Export kvm_print_register_name()
  target/arm/kvm: Tweak print_register_name() for arm64 system register
  target/arm/machine: Trace cpreg names which do not match on migration
  target/arm/machine: Trace all register mismatches
  target/arm/machine: Fix detection of unknown incoming cpregs

 include/migration/vmstate.h |  10 ++++
 target/arm/kvm_arm.h        |   9 +++
 target/arm/helper.c         |   5 --
 target/arm/kvm-stub.c       |   5 ++
 target/arm/kvm.c            |   9 +--
 target/arm/machine.c        | 115 ++++++++++++++++++++++++++++++------
 target/arm/whpx/whpx-all.c  |   7 ---
 target/arm/trace-events     |   3 +
 8 files changed, 127 insertions(+), 36 deletions(-)

-- 
2.53.0




* [PATCH v3 1/7] vmstate: Introduce VMSTATE_VARRAY_INT32_ALLOC
  2026-03-04 10:15 [PATCH v3 0/7] Improve traces on migration failure due to cpreg number mismatch Eric Auger
@ 2026-03-04 10:15 ` Eric Auger
  2026-03-04 10:15 ` [PATCH v3 2/7] target/arm/machine: Use VMSTATE_VARRAY_INT32_ALLOC for cpreg arrays Eric Auger
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Eric Auger @ 2026-03-04 10:15 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	cohuck, sebott, peterx, philmd, alex.bennee

The existing VMSTATE_VARRAY_INT32 requires the array to be
pre-allocated. However, there are cases where the size is not known in
advance and there is no real need to enforce it.

Introduce VMSTATE_VARRAY_INT32_ALLOC, mirroring the variants we already
have for UINT32 and UINT16.

The first user of this variant will be the target/arm/machine.c cpreg
indexes/values arrays.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
---
 include/migration/vmstate.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index 89f9f49d20a..62c2abd0c49 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -447,6 +447,16 @@ extern const VMStateInfo vmstate_info_qlist;
     .offset     = vmstate_offset_pointer(_state, _field, _type),     \
 }
 
+#define VMSTATE_VARRAY_INT32_ALLOC(_field, _state, _field_num, _version, _info, _type) {\
+    .name       = (stringify(_field)),                               \
+    .version_id = (_version),                                        \
+    .num_offset = vmstate_offset_value(_state, _field_num, int32_t), \
+    .info       = &(_info),                                          \
+    .size       = sizeof(_type),                                     \
+    .flags      = VMS_VARRAY_INT32 | VMS_POINTER | VMS_ALLOC,        \
+    .offset     = vmstate_offset_pointer(_state, _field, _type),     \
+}
+
 #define VMSTATE_VARRAY_UINT32_ALLOC(_field, _state, _field_num, _version, _info, _type) {\
     .name       = (stringify(_field)),                               \
     .version_id = (_version),                                        \
-- 
2.53.0




* [PATCH v3 2/7] target/arm/machine: Use VMSTATE_VARRAY_INT32_ALLOC for cpreg arrays
  2026-03-04 10:15 [PATCH v3 0/7] Improve traces on migration failure due to cpreg number mismatch Eric Auger
  2026-03-04 10:15 ` [PATCH v3 1/7] vmstate: Introduce VMSTATE_VARRAY_INT32_ALLOC Eric Auger
@ 2026-03-04 10:15 ` Eric Auger
  2026-03-04 10:15 ` [PATCH v3 3/7] target/arm/kvm: Export kvm_print_register_name() Eric Auger
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Eric Auger @ 2026-03-04 10:15 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	cohuck, sebott, peterx, philmd, alex.bennee

This removes the need for explicitly allocating the cpreg_vmstate arrays.
On pre-save we simply point to the cpreg arrays and set the length
accordingly; on post-save the pointers are cleared again.

Replace VMSTATE_INT32_POSITIVE_LE with VMSTATE_INT32 for
cpreg_vmstate_array_len, as the arrays are now dynamically allocated.

Also add a trace point on post_load to trace a potential mismatch
between the number of incoming cpregs and the number of current ones.
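
As an illustration only, the pointer life cycle described above can be
sketched outside of QEMU as follows. DemoCpu and the demo_* helpers are
hypothetical stand-ins, not QEMU code, and demo_load() only approximates
what a VMS_ALLOC field does on the incoming side.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the relevant ARMCPU fields. */
typedef struct {
    uint64_t *live_indexes;     /* owned by the CPU object */
    int32_t live_len;
    uint64_t *vmstate_indexes;  /* borrowed on save, allocated on load */
    int32_t vmstate_len;
} DemoCpu;

/* Save side: alias the live array instead of copying it. */
static void demo_pre_save(DemoCpu *cpu)
{
    cpu->vmstate_indexes = cpu->live_indexes;
    cpu->vmstate_len = cpu->live_len;
}

/* Save side: drop the alias so the live array is never freed through it. */
static void demo_post_save(DemoCpu *cpu)
{
    cpu->vmstate_indexes = NULL;
}

/*
 * Load side: the length arrives first, then an array of that size is
 * allocated and filled. Returns 0 on success, -1 on error.
 */
static int demo_load(DemoCpu *cpu, const uint64_t *wire, int32_t wire_len)
{
    if (wire_len < 0) {
        return -1;
    }
    cpu->vmstate_indexes = calloc(wire_len ? (size_t)wire_len : 1,
                                  sizeof(uint64_t));
    if (!cpu->vmstate_indexes) {
        return -1;
    }
    if (wire_len > 0) {
        memcpy(cpu->vmstate_indexes, wire,
               (size_t)wire_len * sizeof(uint64_t));
    }
    cpu->vmstate_len = wire_len;
    return 0;
}

/* Load side: the temporary array is released once it has been consumed. */
static void demo_post_load(DemoCpu *cpu)
{
    free(cpu->vmstate_indexes);
    cpu->vmstate_indexes = NULL;
}
```

The point of the sketch is that the vmstate pointer is only valid
between a pre and post hook, which is why pre_load can assert that it
is NULL.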

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

---

v1 -> v2:
- also modifies the allocation of cpreg_vmstate_* in
  target/arm/whpx/whpx-all.c
- added Peter's suggested comment on cpu_pre_save()
- free the vmstate arrays on post_load
- add assert on pre_load
- fix comment about length check in machine.c

v2 -> v3:
- clear the pointers also in post_save()
---
 target/arm/helper.c        |  5 -----
 target/arm/kvm.c           |  5 -----
 target/arm/machine.c       | 45 +++++++++++++++++++++++++++-----------
 target/arm/whpx/whpx-all.c |  7 ------
 target/arm/trace-events    |  3 +++
 5 files changed, 35 insertions(+), 30 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 6bfab90981c..7389f2988c4 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -265,15 +265,10 @@ void arm_init_cpreg_list(ARMCPU *cpu)
     if (arraylen) {
         cpu->cpreg_indexes = g_new(uint64_t, arraylen);
         cpu->cpreg_values = g_new(uint64_t, arraylen);
-        cpu->cpreg_vmstate_indexes = g_new(uint64_t, arraylen);
-        cpu->cpreg_vmstate_values = g_new(uint64_t, arraylen);
     } else {
         cpu->cpreg_indexes = NULL;
         cpu->cpreg_values = NULL;
-        cpu->cpreg_vmstate_indexes = NULL;
-        cpu->cpreg_vmstate_values = NULL;
     }
-    cpu->cpreg_vmstate_array_len = arraylen;
     cpu->cpreg_array_len = 0;
 
     g_hash_table_foreach(cpu->cp_regs, add_cpreg_to_list, cpu);
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index eaa065d7261..555083e7aaf 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -864,12 +864,7 @@ static int kvm_arm_init_cpreg_list(ARMCPU *cpu)
 
     cpu->cpreg_indexes = g_renew(uint64_t, cpu->cpreg_indexes, arraylen);
     cpu->cpreg_values = g_renew(uint64_t, cpu->cpreg_values, arraylen);
-    cpu->cpreg_vmstate_indexes = g_renew(uint64_t, cpu->cpreg_vmstate_indexes,
-                                         arraylen);
-    cpu->cpreg_vmstate_values = g_renew(uint64_t, cpu->cpreg_vmstate_values,
-                                        arraylen);
     cpu->cpreg_array_len = arraylen;
-    cpu->cpreg_vmstate_array_len = arraylen;
 
     for (i = 0, arraylen = 0; i < rlp->n; i++) {
         uint64_t regidx = rlp->reg[i];
diff --git a/target/arm/machine.c b/target/arm/machine.c
index bbaae344492..d3d4f2ddc15 100644
--- a/target/arm/machine.c
+++ b/target/arm/machine.c
@@ -1,5 +1,6 @@
 #include "qemu/osdep.h"
 #include "cpu.h"
+#include "trace.h"
 #include "qemu/error-report.h"
 #include "system/kvm.h"
 #include "system/tcg.h"
@@ -984,11 +985,14 @@ static int cpu_pre_save(void *opaque)
         }
     }
 
+    /*
+     * On outbound migration, send the data in our cpreg_{values,indexes}
+     * arrays. The migration code will not allocate anything, but just
+     * reads the data pointed to by the VMSTATE_VARRAY_INT32_ALLOC() fields.
+     */
+    cpu->cpreg_vmstate_indexes = cpu->cpreg_indexes;
+    cpu->cpreg_vmstate_values = cpu->cpreg_values;
     cpu->cpreg_vmstate_array_len = cpu->cpreg_array_len;
-    memcpy(cpu->cpreg_vmstate_indexes, cpu->cpreg_indexes,
-           cpu->cpreg_array_len * sizeof(uint64_t));
-    memcpy(cpu->cpreg_vmstate_values, cpu->cpreg_values,
-           cpu->cpreg_array_len * sizeof(uint64_t));
 
     return 0;
 }
@@ -1001,6 +1005,9 @@ static int cpu_post_save(void *opaque)
         pmu_op_finish(&cpu->env);
     }
 
+    cpu->cpreg_vmstate_indexes = NULL;
+    cpu->cpreg_vmstate_values = NULL;
+
     return 0;
 }
 
@@ -1034,6 +1041,9 @@ static int cpu_pre_load(void *opaque)
         pmu_op_start(env);
     }
 
+    g_assert(!cpu->cpreg_vmstate_indexes);
+    g_assert(!cpu->cpreg_vmstate_values);
+
     return 0;
 }
 
@@ -1043,6 +1053,9 @@ static int cpu_post_load(void *opaque, int version_id)
     CPUARMState *env = &cpu->env;
     int i, v;
 
+    trace_cpu_post_load(cpu->cpreg_vmstate_array_len,
+                        cpu->cpreg_array_len);
+
     /*
      * Handle migration compatibility from old QEMU which didn't
      * send the irq-line-state subsection. A QEMU without it did not
@@ -1094,6 +1107,11 @@ static int cpu_post_load(void *opaque, int version_id)
         }
     }
 
+    g_free(cpu->cpreg_vmstate_indexes);
+    g_free(cpu->cpreg_vmstate_values);
+    cpu->cpreg_vmstate_indexes = NULL;
+    cpu->cpreg_vmstate_values = NULL;
+
     /*
      * Misaligned thumb pc is architecturally impossible. Fail the
      * incoming migration. For TCG it would trigger the assert in
@@ -1167,16 +1185,17 @@ const VMStateDescription vmstate_arm_cpu = {
         VMSTATE_UINT32_ARRAY(env.fiq_regs, ARMCPU, 5),
         VMSTATE_UINT64_ARRAY(env.elr_el, ARMCPU, 4),
         VMSTATE_UINT64_ARRAY(env.sp_el, ARMCPU, 4),
-        /* The length-check must come before the arrays to avoid
-         * incoming data possibly overflowing the array.
+        /*
+         * The length must come before the arrays so we can
+         * allocate the arrays before their data arrives
          */
-        VMSTATE_INT32_POSITIVE_LE(cpreg_vmstate_array_len, ARMCPU),
-        VMSTATE_VARRAY_INT32(cpreg_vmstate_indexes, ARMCPU,
-                             cpreg_vmstate_array_len,
-                             0, vmstate_info_uint64, uint64_t),
-        VMSTATE_VARRAY_INT32(cpreg_vmstate_values, ARMCPU,
-                             cpreg_vmstate_array_len,
-                             0, vmstate_info_uint64, uint64_t),
+        VMSTATE_INT32(cpreg_vmstate_array_len, ARMCPU),
+        VMSTATE_VARRAY_INT32_ALLOC(cpreg_vmstate_indexes, ARMCPU,
+                                   cpreg_vmstate_array_len,
+                                   0, vmstate_info_uint64, uint64_t),
+        VMSTATE_VARRAY_INT32_ALLOC(cpreg_vmstate_values, ARMCPU,
+                                   cpreg_vmstate_array_len,
+                                   0, vmstate_info_uint64, uint64_t),
         VMSTATE_UINT64(env.exclusive_addr, ARMCPU),
         VMSTATE_UINT64(env.exclusive_val, ARMCPU),
         VMSTATE_UINT64(env.exclusive_high, ARMCPU),
diff --git a/target/arm/whpx/whpx-all.c b/target/arm/whpx/whpx-all.c
index bb94eac7bf8..c5b108166ac 100644
--- a/target/arm/whpx/whpx-all.c
+++ b/target/arm/whpx/whpx-all.c
@@ -783,12 +783,6 @@ int whpx_init_vcpu(CPUState *cpu)
                                      sregs_match_len);
     arm_cpu->cpreg_values = g_renew(uint64_t, arm_cpu->cpreg_values,
                                     sregs_match_len);
-    arm_cpu->cpreg_vmstate_indexes = g_renew(uint64_t,
-                                             arm_cpu->cpreg_vmstate_indexes,
-                                             sregs_match_len);
-    arm_cpu->cpreg_vmstate_values = g_renew(uint64_t,
-                                            arm_cpu->cpreg_vmstate_values,
-                                            sregs_match_len);
 
     memset(arm_cpu->cpreg_values, 0, sregs_match_len * sizeof(uint64_t));
 
@@ -807,7 +801,6 @@ int whpx_init_vcpu(CPUState *cpu)
         }
     }
     arm_cpu->cpreg_array_len = sregs_cnt;
-    arm_cpu->cpreg_vmstate_array_len = sregs_cnt;
 
     assert(write_cpustate_to_list(arm_cpu, false));
 
diff --git a/target/arm/trace-events b/target/arm/trace-events
index 676d29fe516..2de0406f784 100644
--- a/target/arm/trace-events
+++ b/target/arm/trace-events
@@ -26,3 +26,6 @@ arm_powerctl_reset_cpu(uint64_t mp_aff) "cpu %" PRIu64
 
 # tcg/psci.c and hvf/hvf.c
 arm_psci_call(uint64_t x0, uint64_t x1, uint64_t x2, uint64_t x3, uint32_t cpuid) "PSCI Call x0=0x%016"PRIx64" x1=0x%016"PRIx64" x2=0x%016"PRIx64" x3=0x%016"PRIx64" cpuid=0x%x"
+
+# machine.c
+cpu_post_load(uint32_t cpreg_vmstate_array_len, uint32_t cpreg_array_len) "cpreg_vmstate_array_len=%d cpreg_array_len=%d"
-- 
2.53.0




* [PATCH v3 3/7] target/arm/kvm: Export kvm_print_register_name()
  2026-03-04 10:15 [PATCH v3 0/7] Improve traces on migration failure due to cpreg number mismatch Eric Auger
  2026-03-04 10:15 ` [PATCH v3 1/7] vmstate: Introduce VMSTATE_VARRAY_INT32_ALLOC Eric Auger
  2026-03-04 10:15 ` [PATCH v3 2/7] target/arm/machine: Use VMSTATE_VARRAY_INT32_ALLOC for cpreg arrays Eric Auger
@ 2026-03-04 10:15 ` Eric Auger
  2026-03-04 10:15 ` [PATCH v3 4/7] target/arm/kvm: Tweak print_register_name() for arm64 system register Eric Auger
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Eric Auger @ 2026-03-04 10:15 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	cohuck, sebott, peterx, philmd, alex.bennee

We want to use kvm_print_register_name() in machine.c so
let's export the helper and implement a stub when kvm
is not enabled.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

---

v2 -> v3:
- added "The caller must free the string with g_free()."

v1 -> v2:
- add doc comment
- no code after g_assert_not_reached()
- use char * instead of gchar
---
 target/arm/kvm_arm.h  | 9 +++++++++
 target/arm/kvm-stub.c | 5 +++++
 target/arm/kvm.c      | 2 +-
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index 82ac2aae464..e7c40fb003e 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -231,4 +231,13 @@ void arm_cpu_kvm_set_irq(void *arm_cpu, int irq, int level);
 
 void arm_gic_cap_kvm_probe(GICCapability *v2, GICCapability *v3);
 
+/*
+ * kvm_print_register_name:
+ * @regidx: register KVM index
+ *
+ * Returns a human-readable string representing this register
+ * The caller must free the string with g_free().
+ */
+char *kvm_print_register_name(uint64_t regidx);
+
 #endif
diff --git a/target/arm/kvm-stub.c b/target/arm/kvm-stub.c
index 169ef5f2063..88cbe8d85c4 100644
--- a/target/arm/kvm-stub.c
+++ b/target/arm/kvm-stub.c
@@ -114,3 +114,8 @@ void arm_gic_cap_kvm_probe(GICCapability *v2, GICCapability *v3)
 {
     g_assert_not_reached();
 }
+
+char *kvm_print_register_name(uint64_t regidx)
+{
+    g_assert_not_reached();
+}
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 555083e7aaf..11f6f2dff09 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -970,7 +970,7 @@ static gchar *kvm_print_sve_register_name(uint64_t regidx)
     }
 }
 
-static gchar *kvm_print_register_name(uint64_t regidx)
+char *kvm_print_register_name(uint64_t regidx)
 {
         switch ((regidx & KVM_REG_ARM_COPROC_MASK)) {
         case KVM_REG_ARM_CORE:
-- 
2.53.0




* [PATCH v3 4/7] target/arm/kvm: Tweak print_register_name() for arm64 system register
  2026-03-04 10:15 [PATCH v3 0/7] Improve traces on migration failure due to cpreg number mismatch Eric Auger
                   ` (2 preceding siblings ...)
  2026-03-04 10:15 ` [PATCH v3 3/7] target/arm/kvm: Export kvm_print_register_name() Eric Auger
@ 2026-03-04 10:15 ` Eric Auger
  2026-03-04 10:15 ` [PATCH v3 5/7] target/arm/machine: Trace cpreg names which do not match on migration Eric Auger
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Eric Auger @ 2026-03-04 10:15 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	cohuck, sebott, peterx, philmd, alex.bennee

Unlike other register types, the decoded arm64 system register string
is not introduced by any mention of 'register', which can lead to
unfriendly user-facing traces. Let's prefix it with "system register".
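
For readers unfamiliar with the encoding, here is a minimal standalone
sketch of how such a string can be produced from the low bits of a
register index. The DEMO_* macros and demo_format_sysreg() are
hypothetical; the field positions (op0 at bit 14, op1 at bit 11, crn at
bit 7, crm at bit 3, op2 at bit 0) are assumed from the arm64 KVM uapi
header, and the upper architecture/size/coproc bits are ignored here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed field layout of an arm64 system register index (low 16 bits). */
#define DEMO_OP0(idx) ((int)(((idx) >> 14) & 0x3))
#define DEMO_OP1(idx) ((int)(((idx) >> 11) & 0x7))
#define DEMO_CRN(idx) ((int)(((idx) >> 7) & 0xf))
#define DEMO_CRM(idx) ((int)(((idx) >> 3) & 0xf))
#define DEMO_OP2(idx) ((int)((idx) & 0x7))

/* Format the decoded fields, prefixed with "system register". */
static int demo_format_sysreg(char *buf, size_t len, uint64_t regidx)
{
    return snprintf(buf, len,
                    "system register op0:%d op1:%d crn:%d crm:%d op2:%d",
                    DEMO_OP0(regidx), DEMO_OP1(regidx), DEMO_CRN(regidx),
                    DEMO_CRM(regidx), DEMO_OP2(regidx));
}
```

Under that assumed layout, an index whose low bits are 0xC103 decodes to
op0:3 op1:0 crn:2 crm:0 op2:3.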

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/kvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 11f6f2dff09..d4a68874b88 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -978,7 +978,7 @@ char *kvm_print_register_name(uint64_t regidx)
         case KVM_REG_ARM_DEMUX:
             return g_strdup_printf("demuxed reg %"PRIx64, regidx);
         case KVM_REG_ARM64_SYSREG:
-            return g_strdup_printf("op0:%d op1:%d crn:%d crm:%d op2:%d",
+            return g_strdup_printf("system register op0:%d op1:%d crn:%d crm:%d op2:%d",
                                    CP_REG_ARM64_SYSREG_OP(regidx, OP0),
                                    CP_REG_ARM64_SYSREG_OP(regidx, OP1),
                                    CP_REG_ARM64_SYSREG_OP(regidx, CRN),
-- 
2.53.0




* [PATCH v3 5/7] target/arm/machine: Trace cpreg names which do not match on migration
  2026-03-04 10:15 [PATCH v3 0/7] Improve traces on migration failure due to cpreg number mismatch Eric Auger
                   ` (3 preceding siblings ...)
  2026-03-04 10:15 ` [PATCH v3 4/7] target/arm/kvm: Tweak print_register_name() for arm64 system register Eric Auger
@ 2026-03-04 10:15 ` Eric Auger
  2026-03-04 10:15 ` [PATCH v3 6/7] target/arm/machine: Trace all register mismatches Eric Auger
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Eric Auger @ 2026-03-04 10:15 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	cohuck, sebott, peterx, philmd, alex.bennee

Whenever there is a mismatch between cpreg indexes in the incoming
stream and cpregs exposed by the destination, output the name of
the register. We use a print_register_name() wrapper helper. At the
moment we are only able to do a nice decoding of the index for
KVM regs.

Without this patch, the error would be:
qemu-system-aarch64: load of migration failed: Operation not permitted:
error while loading state for instance 0x0 of device 'cpu': post load
hook failed for: cpu, version_id: 22, minimum_version: 22, ret: -1
which is not helpful for the end user to understand the actual
issue.

This patch adds the actual information about the problem:
qemu-system-aarch64: cpu_post_load: system register
op0:3 op1:0 crn:2 crm:0 op2:3 in the incoming stream but
unknown on the destination, fail migration

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

---

v1 -> v2:
- replaced ',' by ':' in the traces
- added system register in print_register_name when kvm is not enabled
---
 target/arm/machine.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/target/arm/machine.c b/target/arm/machine.c
index d3d4f2ddc15..d9b65b5eed6 100644
--- a/target/arm/machine.c
+++ b/target/arm/machine.c
@@ -1,5 +1,6 @@
 #include "qemu/osdep.h"
 #include "cpu.h"
+#include "cpregs.h"
 #include "trace.h"
 #include "qemu/error-report.h"
 #include "system/kvm.h"
@@ -1047,6 +1048,15 @@ static int cpu_pre_load(void *opaque)
     return 0;
 }
 
+static gchar *print_register_name(uint64_t kvm_regidx)
+{
+    if (kvm_enabled()) {
+        return kvm_print_register_name(kvm_regidx);
+    } else {
+        return g_strdup_printf("system register 0x%x", kvm_to_cpreg_id(kvm_regidx));
+    }
+}
+
 static int cpu_post_load(void *opaque, int version_id)
 {
     ARMCPU *cpu = opaque;
@@ -1085,11 +1095,18 @@ static int cpu_post_load(void *opaque, int version_id)
     for (i = 0, v = 0; i < cpu->cpreg_array_len
              && v < cpu->cpreg_vmstate_array_len; i++) {
         if (cpu->cpreg_vmstate_indexes[v] > cpu->cpreg_indexes[i]) {
-            /* register in our list but not incoming : skip it */
+            g_autofree gchar *name = print_register_name(cpu->cpreg_indexes[i]);
+
+            warn_report("%s: %s "
+                        "expected by the destination but not in the incoming stream: "
+                        "skip it", __func__, name);
             continue;
         }
         if (cpu->cpreg_vmstate_indexes[v] < cpu->cpreg_indexes[i]) {
-            /* register in their list but not ours: fail migration */
+            g_autofree gchar *name = print_register_name(cpu->cpreg_vmstate_indexes[v]);
+
+            error_report("%s: %s in the incoming stream but unknown on the destination: "
+                         "fail migration", __func__, name);
             return -1;
         }
         /* matching register, copy the value over */
-- 
2.53.0




* [PATCH v3 6/7] target/arm/machine: Trace all register mismatches
  2026-03-04 10:15 [PATCH v3 0/7] Improve traces on migration failure due to cpreg number mismatch Eric Auger
                   ` (4 preceding siblings ...)
  2026-03-04 10:15 ` [PATCH v3 5/7] target/arm/machine: Trace cpreg names which do not match on migration Eric Auger
@ 2026-03-04 10:15 ` Eric Auger
  2026-03-04 10:15 ` [PATCH v3 7/7] target/arm/machine: Fix detection of unknown incoming cpregs Eric Auger
  2026-03-05 10:30 ` [PATCH v3 0/7] Improve traces on migration failure due to cpreg number mismatch Peter Maydell
  7 siblings, 0 replies; 9+ messages in thread
From: Eric Auger @ 2026-03-04 10:15 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	cohuck, sebott, peterx, philmd, alex.bennee

At the moment, cpu_post_load() exits with an error on the first
unexpected register found in the incoming stream. Let the code
continue and trace all the issues before exiting.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/machine.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/target/arm/machine.c b/target/arm/machine.c
index d9b65b5eed6..4d9158e6977 100644
--- a/target/arm/machine.c
+++ b/target/arm/machine.c
@@ -1061,6 +1061,7 @@ static int cpu_post_load(void *opaque, int version_id)
 {
     ARMCPU *cpu = opaque;
     CPUARMState *env = &cpu->env;
+    bool fail = false;
     int i, v;
 
     trace_cpu_post_load(cpu->cpreg_vmstate_array_len,
@@ -1093,13 +1094,14 @@ static int cpu_post_load(void *opaque, int version_id)
      */
 
     for (i = 0, v = 0; i < cpu->cpreg_array_len
-             && v < cpu->cpreg_vmstate_array_len; i++) {
+             && v < cpu->cpreg_vmstate_array_len;) {
         if (cpu->cpreg_vmstate_indexes[v] > cpu->cpreg_indexes[i]) {
             g_autofree gchar *name = print_register_name(cpu->cpreg_indexes[i]);
 
             warn_report("%s: %s "
                         "expected by the destination but not in the incoming stream: "
                         "skip it", __func__, name);
+            i++;
             continue;
         }
         if (cpu->cpreg_vmstate_indexes[v] < cpu->cpreg_indexes[i]) {
@@ -1107,12 +1109,18 @@ static int cpu_post_load(void *opaque, int version_id)
 
             error_report("%s: %s in the incoming stream but unknown on the destination: "
                          "fail migration", __func__, name);
-            return -1;
+            v++;
+            fail = true;
+            continue;
         }
         /* matching register, copy the value over */
         cpu->cpreg_values[i] = cpu->cpreg_vmstate_values[v];
+        i++;
         v++;
     }
+    if (fail) {
+        return -1;
+    }
 
     if (kvm_enabled()) {
         if (!kvm_arm_cpu_post_load(cpu)) {
-- 
2.53.0




* [PATCH v3 7/7] target/arm/machine: Fix detection of unknown incoming cpregs
  2026-03-04 10:15 [PATCH v3 0/7] Improve traces on migration failure due to cpreg number mismatch Eric Auger
                   ` (5 preceding siblings ...)
  2026-03-04 10:15 ` [PATCH v3 6/7] target/arm/machine: Trace all register mismatches Eric Auger
@ 2026-03-04 10:15 ` Eric Auger
  2026-03-05 10:30 ` [PATCH v3 0/7] Improve traces on migration failure due to cpreg number mismatch Peter Maydell
  7 siblings, 0 replies; 9+ messages in thread
From: Eric Auger @ 2026-03-04 10:15 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	cohuck, sebott, peterx, philmd, alex.bennee

Currently the cpreg index matching check fails to detect
a situation where both arrays have the same length but
- destination has an extra register not found in the incoming stream (idx1)
- source has an extra register not found in the destination (idx2)
  where idx1 <= idx2
Normally this should fail but it does not.

Fix the logic to scan all indexes.
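
To make the failing case concrete, here is a simplified standalone
sketch (not the QEMU code itself) comparing a loop that stops as soon
as either sorted array is exhausted with one that scans both arrays to
the end. The function names and the sample indexes are illustrative
assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified old logic: stops once the destination array is exhausted. */
static bool old_check_fails(const uint64_t *dst, size_t ndst,
                            const uint64_t *in, size_t nin)
{
    size_t i, v;

    for (i = 0, v = 0; i < ndst && v < nin; i++) {
        if (in[v] > dst[i]) {
            continue;           /* only on destination: skip it */
        }
        if (in[v] < dst[i]) {
            return true;        /* only in incoming stream: fail */
        }
        v++;                    /* match */
    }
    return false;               /* leftover incoming entries go unnoticed */
}

/* Simplified new logic: walk both arrays, then check the incoming tail. */
static bool new_check_fails(const uint64_t *dst, size_t ndst,
                            const uint64_t *in, size_t nin)
{
    size_t i = 0, v = 0;
    bool fail = false;

    while (i < ndst && v < nin) {
        if (in[v] > dst[i]) {
            i++;                /* only on destination: harmless */
        } else if (in[v] < dst[i]) {
            fail = true;        /* only in incoming stream */
            v++;
        } else {
            i++;
            v++;
        }
    }
    /* Any remaining incoming index is unknown on the destination. */
    if (v < nin) {
        fail = true;
    }
    return fail;
}
```

With destination indexes {1, 2, 3} and incoming indexes {1, 3, 4}, the
first loop reaches the end of the destination array before it ever
looks at incoming index 4 and reports success, while the second one
reports a failure.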

Fixes: 721fae12536 ("target-arm: Convert TCG to using (index,value) list for cp migration")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/machine.c | 61 +++++++++++++++++++++++++++++++++++---------
 1 file changed, 49 insertions(+), 12 deletions(-)

diff --git a/target/arm/machine.c b/target/arm/machine.c
index 4d9158e6977..476dad00ee7 100644
--- a/target/arm/machine.c
+++ b/target/arm/machine.c
@@ -1057,6 +1057,35 @@ static gchar *print_register_name(uint64_t kvm_regidx)
     }
 }
 
+/*
+ * Handle the situation where @kvmidx is on destination but not
+ * in the incoming stream. This never fails the migration.
+ */
+static void handle_cpreg_missing_in_incoming_stream(ARMCPU *cpu, uint64_t kvmidx)
+{
+    g_autofree gchar *name = print_register_name(kvmidx);
+
+    warn_report("%s: %s "
+                "expected by the destination but not in the incoming stream: "
+                 "skip it", __func__, name);
+}
+
+/*
+ * Handle the situation where @kvmidx is in the incoming stream
+ * but not on destination. This currently fails the migration but
+ * we plan to accomodate some exceptions, hence the boolean returned value.
+ */
+static bool handle_cpreg_only_in_incoming_stream(ARMCPU *cpu, uint64_t kvmidx)
+{
+    g_autofree gchar *name = print_register_name(kvmidx);
+    bool fail = true;
+
+    error_report("%s: %s in the incoming stream but unknown on the "
+                 "destination: fail migration", __func__, name);
+
+    return fail;
+}
+
 static int cpu_post_load(void *opaque, int version_id)
 {
     ARMCPU *cpu = opaque;
@@ -1096,21 +1125,12 @@ static int cpu_post_load(void *opaque, int version_id)
     for (i = 0, v = 0; i < cpu->cpreg_array_len
              && v < cpu->cpreg_vmstate_array_len;) {
         if (cpu->cpreg_vmstate_indexes[v] > cpu->cpreg_indexes[i]) {
-            g_autofree gchar *name = print_register_name(cpu->cpreg_indexes[i]);
-
-            warn_report("%s: %s "
-                        "expected by the destination but not in the incoming stream: "
-                        "skip it", __func__, name);
-            i++;
+            handle_cpreg_missing_in_incoming_stream(cpu, cpu->cpreg_indexes[i++]);
             continue;
         }
         if (cpu->cpreg_vmstate_indexes[v] < cpu->cpreg_indexes[i]) {
-            g_autofree gchar *name = print_register_name(cpu->cpreg_vmstate_indexes[v]);
-
-            error_report("%s: %s in the incoming stream but unknown on the destination: "
-                         "fail migration", __func__, name);
-            v++;
-            fail = true;
+            fail = handle_cpreg_only_in_incoming_stream(cpu,
+                                                        cpu->cpreg_vmstate_indexes[v++]);
             continue;
         }
         /* matching register, copy the value over */
@@ -1118,6 +1138,23 @@ static int cpu_post_load(void *opaque, int version_id)
         i++;
         v++;
     }
+
+    /*
+     * if we have reached the end of the incoming array but there are
+     * still regs in cpreg, continue parsing the regs which are missing
+     * in the input stream
+     */
+    for ( ; i < cpu->cpreg_array_len; i++) {
+        handle_cpreg_missing_in_incoming_stream(cpu, cpu->cpreg_indexes[i]);
+    }
+    /*
+     * if we have reached the end of the cpreg array but there are
+     * still regs in the input stream, continue parsing the vmstate array
+     */
+    for ( ; v < cpu->cpreg_vmstate_array_len; v++) {
+        fail = handle_cpreg_only_in_incoming_stream(cpu,
+                                                    cpu->cpreg_vmstate_indexes[v]);
+    }
     if (fail) {
         return -1;
     }
-- 
2.53.0




* Re: [PATCH v3 0/7] Improve traces on migration failure due to cpreg number mismatch
  2026-03-04 10:15 [PATCH v3 0/7] Improve traces on migration failure due to cpreg number mismatch Eric Auger
                   ` (6 preceding siblings ...)
  2026-03-04 10:15 ` [PATCH v3 7/7] target/arm/machine: Fix detection of unknown incoming cpregs Eric Auger
@ 2026-03-05 10:30 ` Peter Maydell
  7 siblings, 0 replies; 9+ messages in thread
From: Peter Maydell @ 2026-03-05 10:30 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-devel, qemu-arm, cohuck, sebott, peterx,
	philmd, alex.bennee

On Wed, 4 Mar 2026 at 10:16, Eric Auger <eric.auger@redhat.com> wrote:
>
> This series comes as a follow-up of discussions held in
> [PATCH v6 00/11] Mitigation of "failed to load
> cpu:cpreg_vmstate_array_len" migration failures
> (https://lore.kernel.org/all/20260126165445.3033335-1-eric.auger@redhat.com/)
>
> It only covers the improvement of the traces. Actual mitigations
> are handled in a follow-up series:
> Mitigation of "failed to load cpu:cpreg_vmstate_array_len" (v8)
>
> When the number of CPU registers received in an incoming migration
> stream is bigger than the number of CPU registers seen by the
> destination we currently get the following error:
>
> "failed to load cpu:cpreg_vmstate_array_len"
>
> This series removes this cryptic message and explicitly outputs
> which spurious registers cause the failures.
>
> ---



Applied to target-arm.next, thanks.

-- PMM


