qemu-devel.nongnu.org archive mirror
* [RFT PATCH v2 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities
@ 2025-09-23 10:41 Paolo Bonzini
  2025-09-23 10:41 ` [RFT PATCH v2 1/2] target/i386: add compatibility property for arch_capabilities Paolo Bonzini
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Paolo Bonzini @ 2025-09-23 10:41 UTC (permalink / raw)
  To: qemu-devel; +Cc: hector.cao, lk, berrange

Add two compatibility properties to restore legacy behavior of machine types
prior to QEMU 10.1.  Each of them addresses one of the two changes to CPUID:

- ARCH_CAPABILITIES should not be autoenabled when the CPU model specifies AMD
  as the vendor

- specifying PDCM without PMU now causes an error, instead of being silently
  dropped in cpu_x86_cpuid.

Note, I only tested this lightly.

Paolo

Hector Cao (1):
  target/i386: add compatibility property for pdcm feature

Paolo Bonzini (1):
  target/i386: add compatibility property for arch_capabilities

 target/i386/cpu.h     | 12 ++++++++++++
 hw/i386/pc.c          |  2 ++
 target/i386/cpu.c     | 32 +++++++++++++++++++++++++++++---
 target/i386/kvm/kvm.c |  6 +-----
 4 files changed, 44 insertions(+), 8 deletions(-)

-- 
2.51.0




* [RFT PATCH v2 1/2] target/i386: add compatibility property for arch_capabilities
  2025-09-23 10:41 [RFT PATCH v2 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities Paolo Bonzini
@ 2025-09-23 10:41 ` Paolo Bonzini
  2025-09-25 16:09   ` Zhao Liu
  2025-09-23 10:41 ` [RFT PATCH v2 2/2] target/i386: add compatibility property for pdcm feature Paolo Bonzini
  2025-09-25 16:17 ` [RFT PATCH v2 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities Zhao Liu
  2 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2025-09-23 10:41 UTC (permalink / raw)
  To: qemu-devel; +Cc: hector.cao, lk, berrange

Prior to v10.1, arch-capabilities was always on if requested by the
user, even when CPUID advertised it as off/unavailable.
This causes a migration issue for VMs that were started on a machine
without arch-capabilities and expect this feature to be present
on the destination host running QEMU 10.1.

Add a compatibility property to restore the legacy behavior for all
machines with version prior to 10.1.

Co-authored-by: Hector Cao <hector.cao@canonical.com>
Signed-off-by: Hector Cao <hector.cao@canonical.com>
Fixes: d3a24134e37 ("target/i386: do not expose ARCH_CAPABILITIES on AMD CPU", 2025-07-17)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/cpu.h     |  6 ++++++
 hw/i386/pc.c          |  1 +
 target/i386/cpu.c     | 17 +++++++++++++++++
 target/i386/kvm/kvm.c |  6 +-----
 4 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index e0be7a74068..414ca968e84 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -2314,6 +2314,12 @@ struct ArchCPU {
     /* Forcefully disable KVM PV features not exposed in guest CPUIDs */
     bool kvm_pv_enforce_cpuid;
 
+    /*
+     * Expose arch-capabilities unconditionally even on AMD models, for backwards
+     * compatibility with QEMU <10.1.
+     */
+    bool arch_cap_always_on;
+
     /* Number of physical address bits supported */
     uint32_t phys_bits;
 
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index bc048a6d137..d7f48150fdd 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -87,6 +87,7 @@ const size_t pc_compat_10_1_len = G_N_ELEMENTS(pc_compat_10_1);
 GlobalProperty pc_compat_10_0[] = {
     { TYPE_X86_CPU, "x-consistent-cache", "false" },
     { TYPE_X86_CPU, "x-vendor-cpuid-only-v2", "false" },
+    { TYPE_X86_CPU, "x-arch-cap-always-on", "true" },
 };
 const size_t pc_compat_10_0_len = G_N_ELEMENTS(pc_compat_10_0);
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 6d85149e6e1..fe369bb1284 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -7539,6 +7539,20 @@ uint64_t x86_cpu_get_supported_feature_word(X86CPU *cpu, FeatureWord w)
 #endif
         break;
 
+    case FEAT_7_0_EDX:
+        /*
+         * Windows does not like ARCH_CAPABILITIES on AMD machines at all.
+         * Do not show the fake ARCH_CAPABILITIES MSR that KVM sets up,
+         * except if needed for migration.
+         *
+         * When arch_cap_always_on is removed, this tweak can move to
+         * kvm_arch_get_supported_cpuid.
+         */
+        if (cpu && IS_AMD_CPU(&cpu->env) && !cpu->arch_cap_always_on) {
+            unavail = CPUID_7_0_EDX_ARCH_CAPABILITIES;
+        }
+        break;
+
     default:
         break;
     }
@@ -10004,6 +10018,9 @@ static const Property x86_cpu_properties[] = {
                      true),
     DEFINE_PROP_BOOL("x-l1-cache-per-thread", X86CPU, l1_cache_per_core, true),
     DEFINE_PROP_BOOL("x-force-cpuid-0x1f", X86CPU, force_cpuid_0x1f, false),
+
+    DEFINE_PROP_BOOL("x-arch-cap-always-on", X86CPU,
+                     arch_cap_always_on, false),
 };
 
 #ifndef CONFIG_USER_ONLY
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 6a3a1c1ed8e..db40caa3412 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -503,12 +503,8 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function,
          * Linux v4.17-v4.20 incorrectly return ARCH_CAPABILITIES on SVM hosts.
          * We can detect the bug by checking if MSR_IA32_ARCH_CAPABILITIES is
          * returned by KVM_GET_MSR_INDEX_LIST.
-         *
-         * But also, because Windows does not like ARCH_CAPABILITIES on AMD
-         * mcahines at all, do not show the fake ARCH_CAPABILITIES MSR that
-         * KVM sets up.
          */
-        if (!has_msr_arch_capabs || !(edx & CPUID_7_0_EDX_ARCH_CAPABILITIES)) {
+        if (!has_msr_arch_capabs) {
             ret &= ~CPUID_7_0_EDX_ARCH_CAPABILITIES;
         }
     } else if (function == 7 && index == 1 && reg == R_EAX) {
-- 
2.51.0
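
The interaction between the compat entry and the new FEAT_7_0_EDX case
can be sketched in isolation.  This is only a simplified model, not QEMU
code: ToyCPU, ToyCompatProp, toy_compat_10_0 and the toy_* functions are
hypothetical stand-ins for X86CPU, GlobalProperty, pc_compat_10_0 and
x86_cpu_get_supported_feature_word().  The bit position (CPUID leaf 7,
subleaf 0, EDX bit 29) is the architectural one for ARCH_CAPABILITIES.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* CPUID.(EAX=7,ECX=0):EDX bit 29 enumerates IA32_ARCH_CAPABILITIES. */
#define ARCH_CAPABILITIES_BIT (1u << 29)

/* Hypothetical stand-in for the CPU object; not QEMU's X86CPU. */
typedef struct {
    bool vendor_is_amd;
    bool arch_cap_always_on;   /* models "x-arch-cap-always-on" */
} ToyCPU;

/* Hypothetical stand-in for one compat (GlobalProperty-like) entry. */
typedef struct {
    const char *property;
    const char *value;
} ToyCompatProp;

/* Mirrors the entry that the patch adds to pc_compat_10_0[]. */
static const ToyCompatProp toy_compat_10_0[] = {
    { "x-arch-cap-always-on", "true" },
};

/* Apply compat entries, as an old versioned machine type would. */
static void toy_apply_compat(ToyCPU *cpu, const ToyCompatProp *props, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(props[i].property, "x-arch-cap-always-on") == 0) {
            cpu->arch_cap_always_on = strcmp(props[i].value, "true") == 0;
        }
    }
}

/* Same condition as the FEAT_7_0_EDX case added by the patch. */
static uint32_t toy_supported_7_0_edx(const ToyCPU *cpu, uint32_t host_bits)
{
    uint32_t unavail = 0;

    if (cpu->vendor_is_amd && !cpu->arch_cap_always_on) {
        unavail = ARCH_CAPABILITIES_BIT;
    }
    return host_bits & ~unavail;
}

/* Usage: an AMD model on a new machine type vs. on a 10.0-or-older one. */
static void toy_arch_cap_self_check(void)
{
    ToyCPU new_machine = { .vendor_is_amd = true };
    ToyCPU old_machine = { .vendor_is_amd = true };

    toy_apply_compat(&old_machine, toy_compat_10_0,
                     sizeof(toy_compat_10_0) / sizeof(toy_compat_10_0[0]));

    assert(toy_supported_7_0_edx(&new_machine, ARCH_CAPABILITIES_BIT) == 0);
    assert(toy_supported_7_0_edx(&old_machine, ARCH_CAPABILITIES_BIT) != 0);
}
```

In this model only the machine type decides whether an AMD-vendor CPU
keeps the bit, which is the property migration from older QEMU relies on.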




* [RFT PATCH v2 2/2] target/i386: add compatibility property for pdcm feature
  2025-09-23 10:41 [RFT PATCH v2 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities Paolo Bonzini
  2025-09-23 10:41 ` [RFT PATCH v2 1/2] target/i386: add compatibility property for arch_capabilities Paolo Bonzini
@ 2025-09-23 10:41 ` Paolo Bonzini
  2025-09-25 16:10   ` Zhao Liu
  2025-09-25 16:17 ` [RFT PATCH v2 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities Zhao Liu
  2 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2025-09-23 10:41 UTC (permalink / raw)
  To: qemu-devel; +Cc: hector.cao, lk, berrange

From: Hector Cao <hector.cao@canonical.com>

The pdcm feature is supposed to be disabled when PMU is not
available. Up until v10.1, the pdcm feature was enabled even when PMU
was off. This behavior has been fixed, but the change breaks the
migration of VMs that were started with QEMU < 10.1 and expect the pdcm
feature to be enabled on the destination host.

This commit restores the legacy behavior for machines with version
prior to 10.1 to allow the migration from older QEMU to QEMU 10.1.

Signed-off-by: Hector Cao <hector.cao@canonical.com>
Link: https://lore.kernel.org/r/20250910115733.21149-3-hector.cao@canonical.com
Fixes: e68ec298090 ("i386/cpu: Move adjustment of CPUID_EXT_PDCM before feature_dependencies[] check", 2025-06-20)
[Move property from migration object to CPU. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/cpu.h |  6 ++++++
 hw/i386/pc.c      |  1 +
 target/i386/cpu.c | 15 ++++++++++++---
 3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 414ca968e84..42168f1d6d8 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -2320,6 +2320,12 @@ struct ArchCPU {
      */
     bool arch_cap_always_on;
 
+    /*
+     * Backwards compatibility with QEMU <10.1. The PDCM feature is now disabled when
+     * PMU is not available, but prior to 10.1 it was enabled even if PMU is off.
+     */
+    bool pdcm_on_even_without_pmu;
+
     /* Number of physical address bits supported */
     uint32_t phys_bits;
 
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index d7f48150fdd..4668918746e 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -88,6 +88,7 @@ GlobalProperty pc_compat_10_0[] = {
     { TYPE_X86_CPU, "x-consistent-cache", "false" },
     { TYPE_X86_CPU, "x-vendor-cpuid-only-v2", "false" },
     { TYPE_X86_CPU, "x-arch-cap-always-on", "true" },
+    { TYPE_X86_CPU, "x-pdcm-on-even-without-pmu", "true" },
 };
 const size_t pc_compat_10_0_len = G_N_ELEMENTS(pc_compat_10_0);
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index fe369bb1284..ab18de894e4 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -7908,6 +7908,11 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             /* Fixup overflow: max value for bits 23-16 is 255. */
             *ebx |= MIN(num, 255) << 16;
         }
+        if (cpu->pdcm_on_even_without_pmu) {
+            if (!cpu->enable_pmu) {
+                *ecx &= ~CPUID_EXT_PDCM;
+            }
+        }
         break;
     case 2: { /* cache info: needed for Pentium Pro compatibility */
         const CPUCaches *caches;
@@ -8958,9 +8963,11 @@ void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
         }
     }
 
-    /* PDCM is fixed1 bit for TDX */
-    if (!cpu->enable_pmu && !is_tdx_vm()) {
-        env->features[FEAT_1_ECX] &= ~CPUID_EXT_PDCM;
+    if (!cpu->pdcm_on_even_without_pmu) {
+        /* PDCM is fixed1 bit for TDX */
+        if (!cpu->enable_pmu && !is_tdx_vm()) {
+            env->features[FEAT_1_ECX] &= ~CPUID_EXT_PDCM;
+        }
     }
 
     for (i = 0; i < ARRAY_SIZE(feature_dependencies); i++) {
@@ -10021,6 +10028,8 @@ static const Property x86_cpu_properties[] = {
 
     DEFINE_PROP_BOOL("x-arch-cap-always-on", X86CPU,
                      arch_cap_always_on, false),
+    DEFINE_PROP_BOOL("x-pdcm-on-even-without-pmu", X86CPU,
+                     pdcm_on_even_without_pmu, false),
 };
 
 #ifndef CONFIG_USER_ONLY
-- 
2.51.0
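
The two code paths touched by this patch can be modelled side by side.
This is a simplified sketch, not QEMU code: ToyCPU and the toy_*
functions are hypothetical stand-ins for X86CPU,
x86_cpu_expand_features() and the leaf 1 case of cpu_x86_cpuid(), and
the TDX special case is left out.  PDCM is CPUID.01H:ECX bit 15.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUID.01H:ECX bit 15 enumerates PDCM. */
#define PDCM_BIT (1u << 15)

/* Hypothetical stand-in for the CPU object; not QEMU's X86CPU. */
typedef struct {
    bool enable_pmu;
    bool pdcm_on_even_without_pmu; /* models "x-pdcm-on-even-without-pmu" */
    uint32_t feat_1_ecx;           /* models env->features[FEAT_1_ECX] */
} ToyCPU;

/* Feature expansion: the current behavior clears PDCM early (TDX ignored). */
static void toy_expand_features(ToyCPU *cpu)
{
    if (!cpu->pdcm_on_even_without_pmu && !cpu->enable_pmu) {
        cpu->feat_1_ecx &= ~PDCM_BIT;
    }
}

/* CPUID leaf 1 ECX: the legacy behavior hides PDCM only at this late stage. */
static uint32_t toy_cpuid_1_ecx(const ToyCPU *cpu)
{
    uint32_t ecx = cpu->feat_1_ecx;

    if (cpu->pdcm_on_even_without_pmu && !cpu->enable_pmu) {
        ecx &= ~PDCM_BIT;
    }
    return ecx;
}

/* Usage: PDCM requested, PMU off, on an old vs. a new machine type. */
static void toy_pdcm_self_check(void)
{
    ToyCPU legacy = { .pdcm_on_even_without_pmu = true, .feat_1_ecx = PDCM_BIT };
    ToyCPU current = { .pdcm_on_even_without_pmu = false, .feat_1_ecx = PDCM_BIT };

    toy_expand_features(&legacy);
    toy_expand_features(&current);

    /* The internal feature word differs between the two modes... */
    assert((legacy.feat_1_ecx & PDCM_BIT) != 0);
    assert((current.feat_1_ecx & PDCM_BIT) == 0);
    /* ...while the CPUID leaf hides the bit in both. */
    assert((toy_cpuid_1_ecx(&legacy) & PDCM_BIT) == 0);
    assert((toy_cpuid_1_ecx(&current) & PDCM_BIT) == 0);
}
```

In this model the difference between the modes is where the bit is
dropped: legacy machine types keep PDCM in the feature word and mask it
in the CPUID leaf, while newer ones clear it before the
feature_dependencies[] check runs.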




* Re: [RFT PATCH v2 1/2] target/i386: add compatibility property for arch_capabilities
  2025-09-23 10:41 ` [RFT PATCH v2 1/2] target/i386: add compatibility property for arch_capabilities Paolo Bonzini
@ 2025-09-25 16:09   ` Zhao Liu
  0 siblings, 0 replies; 15+ messages in thread
From: Zhao Liu @ 2025-09-25 16:09 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, hector.cao, lk, berrange

On Tue, Sep 23, 2025 at 12:41:35PM +0200, Paolo Bonzini wrote:
> Date: Tue, 23 Sep 2025 12:41:35 +0200
> From: Paolo Bonzini <pbonzini@redhat.com>
> Subject: [RFT PATCH v2 1/2] target/i386: add compatibility property for
>  arch_capabilities
> X-Mailer: git-send-email 2.51.0
> 
> Prior to v10.1, if requested by user, arch-capabilities is always on
> despite the fact that CPUID advertises it to be off/unavailable.
> This causes a migration issue for VMs that are run on a machine
> without arch-capabilities and expect this feature to be present
> on the destination host with QEMU 10.1.
> 
> Add a compatibility property to restore the legacy behavior for all
> machines with version prior to 10.1.
> 
> Co-authored-by: Hector Cao <hector.cao@canonical.com>
> Signed-off-by: Hector Cao <hector.cao@canonical.com>
> Fixes: d3a24134e37 ("target/i386: do not expose ARCH_CAPABILITIES on AMD CPU", 2025-07-17)
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  target/i386/cpu.h     |  6 ++++++
>  hw/i386/pc.c          |  1 +
>  target/i386/cpu.c     | 17 +++++++++++++++++
>  target/i386/kvm/kvm.c |  6 +-----
>  4 files changed, 25 insertions(+), 5 deletions(-)

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>




* Re: [RFT PATCH v2 2/2] target/i386: add compatibility property for pdcm feature
  2025-09-23 10:41 ` [RFT PATCH v2 2/2] target/i386: add compatibility property for pdcm feature Paolo Bonzini
@ 2025-09-25 16:10   ` Zhao Liu
  0 siblings, 0 replies; 15+ messages in thread
From: Zhao Liu @ 2025-09-25 16:10 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, hector.cao, lk, berrange

On Tue, Sep 23, 2025 at 12:41:36PM +0200, Paolo Bonzini wrote:
> Date: Tue, 23 Sep 2025 12:41:36 +0200
> From: Paolo Bonzini <pbonzini@redhat.com>
> Subject: [RFT PATCH v2 2/2] target/i386: add compatibility property for
>  pdcm feature
> X-Mailer: git-send-email 2.51.0
> 
> From: Hector Cao <hector.cao@canonical.com>
> 
> The pdcm feature is supposed to be disabled when PMU is not
> available. Up until v10.1, pdcm feature is enabled even when PMU
> is off. This behavior has been fixed but this change breaks the
> migration of VMs that are run with QEMU < 10.0 and expect the pdcm
> feature to be enabled on the destination host.
> 
> This commit restores the legacy behavior for machines with version
> prior to 10.1 to allow the migration from older QEMU to QEMU 10.1.
> 
> Signed-off-by: Hector Cao <hector.cao@canonical.com>
> Link: https://lore.kernel.org/r/20250910115733.21149-3-hector.cao@canonical.com
> Fixes: e68ec298090 ("i386/cpu: Move adjustment of CPUID_EXT_PDCM before feature_dependencies[] check", 2025-06-20)
> [Move property from migration object to CPU. - Paolo]
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  target/i386/cpu.h |  6 ++++++
>  hw/i386/pc.c      |  1 +
>  target/i386/cpu.c | 15 ++++++++++++---
>  3 files changed, 19 insertions(+), 3 deletions(-)

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>




* Re: [RFT PATCH v2 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities
  2025-09-23 10:41 [RFT PATCH v2 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities Paolo Bonzini
  2025-09-23 10:41 ` [RFT PATCH v2 1/2] target/i386: add compatibility property for arch_capabilities Paolo Bonzini
  2025-09-23 10:41 ` [RFT PATCH v2 2/2] target/i386: add compatibility property for pdcm feature Paolo Bonzini
@ 2025-09-25 16:17 ` Zhao Liu
  2025-09-28  9:41   ` Paolo Bonzini
                     ` (2 more replies)
  2 siblings, 3 replies; 15+ messages in thread
From: Zhao Liu @ 2025-09-25 16:17 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, hector.cao, lk, berrange, Michael Tokarev

On Tue, Sep 23, 2025 at 12:41:34PM +0200, Paolo Bonzini wrote:
> Date: Tue, 23 Sep 2025 12:41:34 +0200
> From: Paolo Bonzini <pbonzini@redhat.com>
> Subject: [RFT PATCH v2 0/2] Fix cross migration issue with missing
>  features: pdcm, arch-capabilities
> X-Mailer: git-send-email 2.51.0
> 
> Add two compatibility properties to restore legacy behavior of machine types
> prior to QEMU 10.1.  Each of them addresses the two changes to CPUID:
> 
> - ARCH_CAPABILITIES should not be autoenabled when the CPU model specifies AMD
>   as the vendor
> 
> - specifying PDCM without PMU now causes an error, instead of being silently
>   dropped in cpu_x86_cpuid.
> 
> Note, I only tested this lightly.

Sorry for the late reply.

I found the previous 2 fixes were merged into stable 10.0:

24778b1c7ee7aca9721ed4757b0e0df0c16390f7
3d26cb65c27190e57637644ecf6c96b8c3d246a3

Should stable 10.0 revert these 2 fixes, to ensure migration
compatibility?

(+Michael)

Regards,
Zhao




* Re: [RFT PATCH v2 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities
  2025-09-25 16:17 ` [RFT PATCH v2 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities Zhao Liu
@ 2025-09-28  9:41   ` Paolo Bonzini
  2025-10-08  8:47     ` Michael Tokarev
  2025-10-08 13:32   ` Michael Tokarev
  2025-10-10 17:40   ` Michael Tokarev
  2 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2025-09-28  9:41 UTC (permalink / raw)
  To: Zhao Liu; +Cc: qemu-devel, hector.cao, lk, berrange, Michael Tokarev

On 9/25/25 18:17, Zhao Liu wrote:
> On Tue, Sep 23, 2025 at 12:41:34PM +0200, Paolo Bonzini wrote:
>> Date: Tue, 23 Sep 2025 12:41:34 +0200
>> From: Paolo Bonzini <pbonzini@redhat.com>
>> Subject: [RFT PATCH v2 0/2] Fix cross migration issue with missing
>>   features: pdcm, arch-capabilities
>> X-Mailer: git-send-email 2.51.0
>>
>> Add two compatibility properties to restore legacy behavior of machine types
>> prior to QEMU 10.1.  Each of them addresses the two changes to CPUID:
>>
>> - ARCH_CAPABILITIES should not be autoenabled when the CPU model specifies AMD
>>    as the vendor
>>
>> - specifying PDCM without PMU now causes an error, instead of being silently
>>    dropped in cpu_x86_cpuid.
>>
>> Note, I only tested this lightly.
> 
> Sorry for late.
> 
> I found the previous 2 fixes were merged into stable 10.0:
> 
> 24778b1c7ee7aca9721ed4757b0e0df0c16390f7
> 3d26cb65c27190e57637644ecf6c96b8c3d246a3

Yes, thanks for noticing it Zhao.  Because we cannot apply the machine 
type changes to 10.0, those two patches have to be reverted.

Paolo




* Re: [RFT PATCH v2 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities
  2025-09-28  9:41   ` Paolo Bonzini
@ 2025-10-08  8:47     ` Michael Tokarev
  0 siblings, 0 replies; 15+ messages in thread
From: Michael Tokarev @ 2025-10-08  8:47 UTC (permalink / raw)
  To: Paolo Bonzini, Zhao Liu
  Cc: qemu-devel, hector.cao, lk, berrange, Xiaoyao Li, Qian Wen

On 9/28/25 12:41, Paolo Bonzini wrote:
> On 9/25/25 18:17, Zhao Liu wrote:
>> On Tue, Sep 23, 2025 at 12:41:34PM +0200, Paolo Bonzini wrote:
>>> Date: Tue, 23 Sep 2025 12:41:34 +0200
>>> From: Paolo Bonzini <pbonzini@redhat.com>
>>> Subject: [RFT PATCH v2 0/2] Fix cross migration issue with missing
>>>   features: pdcm, arch-capabilities
>>> X-Mailer: git-send-email 2.51.0
>>>
>>> Add two compatibility properties to restore legacy behavior of 
>>> machine types
>>> prior to QEMU 10.1.  Each of them addresses the two changes to CPUID:
...
>> I found the previous 2 fixes were merged into stable 10.0:
>>
>> 24778b1c7ee7aca9721ed4757b0e0df0c16390f7
>> 3d26cb65c27190e57637644ecf6c96b8c3d246a3
> 
> Yes, thanks for noticing it Zhao.  Because we cannot apply the machine 
> type changes to 10.0, those two patches have to be reverted.

Hmm.  I missed this message too and noticed it only now
(right after tagging 10.0.5; shame on me!).

What's the problem with these patches in 10.0.x?

IIRC, 24778b1c7e "target/i386: do not expose ARCH_CAPABILITIES on AMD
CPU" fixed a real issue we've hit, but I don't remember the details
off the top of my head anymore.  It can be easily reverted, but the bug
it fixed would return.

3d26cb65c2 "i386/cpu: Move adjustment of CPUID_EXT_PDCM before
feature_dependencies[] check" seemed innocent enough to me, and it
was a prerequisite for other changes in this same area (notably
53f100eeec "i386/cpu: Fix number of addressable IDs field.." and its
subsequent fixup).  Reverting this one requires some editing.

BTW, I Cc all involved people when I pick a patch for each of
the stable series; please let me know if I can make this process
more obvious.  Also, 10.0.x is supposed to be a long-term series.

Thanks!

/mjt



* Re: [RFT PATCH v2 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities
  2025-09-25 16:17 ` [RFT PATCH v2 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities Zhao Liu
  2025-09-28  9:41   ` Paolo Bonzini
@ 2025-10-08 13:32   ` Michael Tokarev
  2025-10-10 17:40   ` Michael Tokarev
  2 siblings, 0 replies; 15+ messages in thread
From: Michael Tokarev @ 2025-10-08 13:32 UTC (permalink / raw)
  To: Zhao Liu, Paolo Bonzini; +Cc: qemu-devel, hector.cao, lk, berrange

On 9/25/25 19:17, Zhao Liu wrote:
...

> I found the previous 2 fixes were merged into stable 10.0:
> 
> 24778b1c7ee7aca9721ed4757b0e0df0c16390f7
> 3d26cb65c27190e57637644ecf6c96b8c3d246a3
> 
> Should stable 10.0 revert these 2 fixes, to ensure migration
> compatibility?

Yes, these two were picked up for the stable 10.0.x series.
I hadn't realized they would cause issues with migration.

These 2 patches are in qemu 10.0.4 (with 10.0.5 tagged
today).  If we revert them for 10.0.6, will it break
migration between 10.0.4 and 10.0.6?  How about migration
between 10.0.3 and 10.0.4?

Thanks,

/mjt




* Re: [RFT PATCH v2 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities
  2025-09-25 16:17 ` [RFT PATCH v2 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities Zhao Liu
  2025-09-28  9:41   ` Paolo Bonzini
  2025-10-08 13:32   ` Michael Tokarev
@ 2025-10-10 17:40   ` Michael Tokarev
  2025-10-13  7:22     ` Zhao Liu
  2 siblings, 1 reply; 15+ messages in thread
From: Michael Tokarev @ 2025-10-10 17:40 UTC (permalink / raw)
  To: Zhao Liu, Paolo Bonzini
  Cc: qemu-devel, hector.cao, lk, berrange, Peter Maydell, qemu-stable

On 9/25/25 19:17, Zhao Liu wrote:
> On Tue, Sep 23, 2025 at 12:41:34PM +0200, Paolo Bonzini wrote:
>> Date: Tue, 23 Sep 2025 12:41:34 +0200
>> From: Paolo Bonzini <pbonzini@redhat.com>
>> Subject: [RFT PATCH v2 0/2] Fix cross migration issue with missing
>>   features: pdcm, arch-capabilities
>> X-Mailer: git-send-email 2.51.0
>>
>> Add two compatibility properties to restore legacy behavior of machine types
>> prior to QEMU 10.1.  Each of them addresses the two changes to CPUID:
>>
>> - ARCH_CAPABILITIES should not be autoenabled when the CPU model specifies AMD
>>    as the vendor
>>
>> - specifying PDCM without PMU now causes an error, instead of being silently
>>    dropped in cpu_x86_cpuid.
>>
>> Note, I only tested this lightly.
> 
> Sorry for late.
> 
> I found the previous 2 fixes were merged into stable 10.0:
> 
> 24778b1c7ee7aca9721ed4757b0e0df0c16390f7
> 3d26cb65c27190e57637644ecf6c96b8c3d246a3
> 
> Should stable 10.0 revert these 2 fixes, to ensure migration
> compatibility?

Now that I think about it:

There were at least 2 point releases of 10.0.x (10.0.4 & 10.0.5)
with these 2 patches already.  Reverting them in 10.0 will make
10.0 non-migratable with itself (10.0.5 can't be migrated
to 10.0.6 if we release 10.0.6 with these 2 patches reverted).

Also, as far as I can see (and I asked about this some 5 times
already, with no one answering - is it that difficult?), we
should pick this series (pdcm, arch-capabilities) for the 10.1.x stable
series too, since we can't migrate from previous versions to 10.1,
which has the two changes mentioned above.

It looks to me that, since the breakage is already done and both 10.0
and 10.1 are broken, we should accept the current situation as the
status quo and do the following:

1. keep the above mentioned 24778b1c7ee7a and 3d26cb65c27190e5 in
    10.0.x (instead of reverting them);

2. pick up these 2 patches (fix cross migration issue with missing
    pdcm, arch-capabilities) for 10.1.x (it should be done either way,
    I think);

3. on top of these 2 "missing features: pdcm, arch-capabilities"
    patches, draw the compat line before 10.0 rather than before 10.1,
    i.e. treat 10.0 as *also* having the new behavior, while 9.2 and
    earlier keep the legacy one.

This too will make 10.0.5 => 10.0.6 non-migratable, just as if
I reverted 24778b1c7ee7a and 3d26cb65c27190e5 in 10.0.  But this way
we will also have these bugs fixed in 10.0.  And all subsequent
versions of 10.0 and 10.1 will be migratable again.

Please, don't be quiet this time; I need your comments on this
matter, because I don't understand well enough how migration works.

Cc'ing Peter too, because I'm stuck here and none of my questions are
getting answered; maybe he can help to at least clear up some of them.

Thanks,

/mjt
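
The options discussed in this message can be compared with a toy model.
It rests on an assumption that the thread does not confirm: that
migration works only when source and destination compute the same
guest-visible CPUID bit for the same machine type.  ToyRelease, the
REL_* constants, the machine-type encoding and the toy_* functions are
all hypothetical; only one affected bit (for example arch-capabilities
on an AMD CPU model) is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Machine-type versions, encoded as major*100 + minor (hypothetical). */
enum { PC_9_2 = 902, PC_10_0 = 1000, PC_10_1 = 1001 };

/* Hypothetical description of one QEMU build. */
typedef struct {
    bool fix_present;  /* the CPUID behavior change is applied */
    int compat_up_to;  /* newest machine type keeping legacy behavior; 0 = none */
} ToyRelease;

static const ToyRelease REL_10_0_3            = { false, 0 };       /* before the backport */
static const ToyRelease REL_10_0_5            = { true,  0 };       /* backport, no compat */
static const ToyRelease REL_10_0_6_REVERT     = { false, 0 };       /* backport reverted */
static const ToyRelease REL_10_0_6_COMPAT_9_2 = { true,  PC_9_2 };  /* item 3 above */
static const ToyRelease REL_10_1_SERIES       = { true,  PC_10_0 }; /* series as posted */

/* Does the guest see the legacy bit on this build and machine type? */
static bool toy_bit_visible(ToyRelease r, int machine)
{
    return !r.fix_present || machine <= r.compat_up_to;
}

/* Assumed rule: both sides must agree on the guest-visible bit. */
static bool toy_can_migrate(ToyRelease src, ToyRelease dst, int machine)
{
    return toy_bit_visible(src, machine) == toy_bit_visible(dst, machine);
}

/* Usage: the cases raised in the thread, for affected guests only. */
static void toy_migration_self_check(void)
{
    /* The backport itself changed behavior within 10.0.x. */
    assert(!toy_can_migrate(REL_10_0_3, REL_10_0_5, PC_10_0));
    /* Reverting it changes behavior again. */
    assert(!toy_can_migrate(REL_10_0_5, REL_10_0_6_REVERT, PC_10_0));
    /* Keeping it and moving the compat line helps pc-10.0 but not pc-9.2. */
    assert(toy_can_migrate(REL_10_0_5, REL_10_0_6_COMPAT_9_2, PC_10_0));
    assert(!toy_can_migrate(REL_10_0_5, REL_10_0_6_COMPAT_9_2, PC_9_2));
    /* The series as posted matches pre-backport 10.0.x, not 10.0.[45]. */
    assert(toy_can_migrate(REL_10_0_3, REL_10_1_SERIES, PC_10_0));
    assert(!toy_can_migrate(REL_10_0_5, REL_10_1_SERIES, PC_10_0));
}
```

Under these assumptions every option leaves at least one incompatible
pair among already released 10.0.x versions; the options differ in which
pair that is and on which machine types.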



* Re: [RFT PATCH v2 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities
  2025-10-10 17:40   ` Michael Tokarev
@ 2025-10-13  7:22     ` Zhao Liu
  2025-10-13 17:22       ` Michael Tokarev
  0 siblings, 1 reply; 15+ messages in thread
From: Zhao Liu @ 2025-10-13  7:22 UTC (permalink / raw)
  To: Michael Tokarev
  Cc: Paolo Bonzini, qemu-devel, hector.cao, lk, berrange,
	Peter Maydell, qemu-stable

On Fri, Oct 10, 2025 at 08:40:56PM +0300, Michael Tokarev wrote:
> Date: Fri, 10 Oct 2025 20:40:56 +0300
> From: Michael Tokarev <mjt@tls.msk.ru>
> Subject: Re: [RFT PATCH v2 0/2] Fix cross migration issue with missing
>  features: pdcm, arch-capabilities
> 
> On 9/25/25 19:17, Zhao Liu wrote:
> > On Tue, Sep 23, 2025 at 12:41:34PM +0200, Paolo Bonzini wrote:
> > > Date: Tue, 23 Sep 2025 12:41:34 +0200
> > > From: Paolo Bonzini <pbonzini@redhat.com>
> > > Subject: [RFT PATCH v2 0/2] Fix cross migration issue with missing
> > >   features: pdcm, arch-capabilities
> > > X-Mailer: git-send-email 2.51.0
> > > 
> > > Add two compatibility properties to restore legacy behavior of machine types
> > > prior to QEMU 10.1.  Each of them addresses the two changes to CPUID:
> > > 
> > > - ARCH_CAPABILITIES should not be autoenabled when the CPU model specifies AMD
> > >    as the vendor
> > > 
> > > - specifying PDCM without PMU now causes an error, instead of being silently
> > >    dropped in cpu_x86_cpuid.
> > > 
> > > Note, I only tested this lightly.
> > 
> > Sorry for late.
> > 
> > I found the previous 2 fixes were merged into stable 10.0:
> > 
> > 24778b1c7ee7aca9721ed4757b0e0df0c16390f7
> > 3d26cb65c27190e57637644ecf6c96b8c3d246a3
> > 
> > Should stable 10.0 revert these 2 fixes, to ensure migration
> > compatibility?

Sorry for the late reply... I just returned from vacation.

> Now when I think about it.
> 
> There were at least 2 point releases of 10.0.x (10.0.4 & 10.0.5)
> with these 2 patches already.

Hmm, it seems 10.0.x (x < 4) can't migrate to 10.0.y (4 <= y <= 5),
right? If so, could we treat this behavior as a regression?

> Reverting them in 10.0 will make
> 10.0 to be non-migratable with itself (10.0.5 can't be migrated
> to 10.0.6 if we'll release 10.0.6 with these 2 patches reverted).
> 
> Also, as far as I can see (and I asked about this some 5 times
> already, with no one answering - is it that difficult?) - we
> should pick this series (pdcm, arch-capabilities) to 10.1.x stable
> series too, since we can't migrate from previous versions to 10.1
> which has the two changes mentioned above.

I think so. In this series, Paolo added the compat options to
pc_compat_10_0, so it should be picked for stable v10.1.

> It looks to me - since the breakage is already done, and both 10.0
> and 10.1 is broken, we should declare the current situation as a
> status quo, and do the following:
> 
> 1. keep the above mentioned 24778b1c7ee7a and 3d26cb65c27190e5 in
>    10.0.x (instead of reverting them);
> 
> 2. pick up this 2 patches (fix cross migration issue with missing
>    pdcm, arch-capabilities) to 10.1.x (it should be done either way,
>    I think);

IIUC, if we pick the current compat options for stable v10.1, then stable
v10.1 requires that v10.0 sets the pdcm & arch-cap bits (i.e., that v10.0
does not apply the fixes, or reverts them).

So it seems the reverts are unavoidable on v10.0?

(Let's see what Paolo and the other maintainers think.)

> 3. on top of these 2 "missing features: pdcm, arch-capabilities",
>    make the crossing line for before-10.0, not for before-10.1 series, -
>    ie, consider 10.0 *also* has these properties, but 9.2 and before
>    are not.
> 
> This too will make 10.0.5 => 10.0.6 non-migrateable, just like if
> I'll revert 24778b1c7ee7a and 3d26cb65c27190e5 in 10.0.  But this way
> we will also have these bugs fixed in 10.0.  And all subsequent
> versions of 10.0 and 10.1 will be migratable again.
> 
> Please, don't be quiet this time, - I need your comments for this
> matter, because I don't understand well enough how migration works.
> 
> Cc'ing Peter too, because I'm stuck here and no my questions are
> getting answered.. maybe he can help to at least clear some questions.

This issue is indeed quite tricky. Sometimes people (including myself)
assume that backporting fixes to the stable branch can avoid adding a
compat option. Now it seems the compat option is the better choice, as
users need to be able to migrate, rather than take downtime, when
upgrading to the stable version :-(.

Thanks,
Zhao




* Re: [RFT PATCH v2 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities
  2025-10-13  7:22     ` Zhao Liu
@ 2025-10-13 17:22       ` Michael Tokarev
  2025-10-14 10:49         ` Hector Cao
  0 siblings, 1 reply; 15+ messages in thread
From: Michael Tokarev @ 2025-10-13 17:22 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Paolo Bonzini, qemu-devel, hector.cao, lk, berrange,
	Peter Maydell, qemu-stable

On 10/13/25 10:22, Zhao Liu wrote:
> On Fri, Oct 10, 2025 at 08:40:56PM +0300, Michael Tokarev wrote:
...
>>> I found the previous 2 fixes were merged into stable 10.0:
>>>
>>> 24778b1c7ee7aca9721ed4757b0e0df0c16390f7
>>> 3d26cb65c27190e57637644ecf6c96b8c3d246a3
>>>
>>> Should stable 10.0 revert these 2 fixes, to ensure migration
>>> compatibility?
> 
> Sorry for late...just return from vacation.

I returned from vacation today too :)

>> Now when I think about it.
>>
>> There were at least 2 point releases of 10.0.x (10.0.4 & 10.0.5)
>> with these 2 patches already.
> 
> EMM, it seems 10.0.x (x < 4) can't migrate to 10.0.y (4 <= y <= 5),
> right? If so, could we treat this behavior as a regression?

It is a regression in 10.0.4 indeed.  But it has already lasted for
2 stable releases (10.0.4 & 10.0.5).  So by reverting the above
mentioned two changes in 10.0.6, we'd create yet another regression,
this time when migrating from 10.0.[45] to 10.0.6.  This is why I
thought it might be an idea to keep just one regression in 10.0.x, so
to say.  Especially since these changes already fix issues with
existing guests, so by reverting them we'd bring those issues back to
10.0.x.

It is an either-or combination.  It is not bad either way, I'm just
thinking what is best currently.

And with my limited understanding of the migration issue in this context
(for which I asked for clarification some 5 or 6 times already), it
feels to me like we could "pretend" that the 2 patches mentioned above
have always been part of 10.0.x, declare that migration won't work from
10.0.[1-3] (or [1-5]?) to subsequent versions, and be done with it.

And modify the 2 properties introduced by:

6529f31e0d target/i386: add compatibility property for pdcm feature
e9efa4a771 target/i386: add compatibility property for arch_capabilities

to be part of the pc_compat_9_2 compat list, not pc_compat_10_0.

Hopefully it's understandable what I mean.

>> Reverting them in 10.0 will make
>> 10.0 to be non-migratable with itself (10.0.5 can't be migrated
>> to 10.0.6 if we'll release 10.0.6 with these 2 patches reverted).
>>
>> Also, as far as I can see (and I asked about this some 5 times
>> already, with no one answering - is it that difficult?) - we
>> should pick this series (pdcm, arch-capabilities) to 10.1.x stable
>> series too, since we can't migrate from previous versions to 10.1
>> which has the two changes mentioned above.
> 
> I think so. in this series, Paolo added compat options in pc_compat_10_0
> so it should be picked to stable v10.1.

Again, I asked about this some 5 times already, with no single
answer.

>> It looks to me - since the breakage is already done, and both 10.0
>> and 10.1 is broken, we should declare the current situation as a
>> status quo, and do the following:
>>
>> 1. keep the above mentioned 24778b1c7ee7a and 3d26cb65c27190e5 in
>>     10.0.x (instead of reverting them);
>>
>> 2. pick up this 2 patches (fix cross migration issue with missing
>>     pdcm, arch-capabilities) to 10.1.x (it should be done either way,
>>     I think);
> 
> IIUC, if we picked current compat options to stable v10.1, then stable
> v10.1 requires previous v10.0 sets the pdcm & arch-cap bits (i.e., do
> not apply the fixes or revert the previous fix).

Ugh.  Confusion++ :)  As you wrote yourself right above, "Paolo added
compat options in pc_compat_10_0, so it should be picked up to stable
10.1".  This point "2" is exactly this case I'm talking about.  Two
commits:

6529f31e0d target/i386: add compatibility property for pdcm feature
e9efa4a771 target/i386: add compatibility property for arch_capabilities

should be picked up for 10.1.x.

This "2" point is not (yet) about 10.0.x.


> So it seems the reverts are unavoidable on v10.0?
> 
> (Let's see what Paolo and the other maintainers think.)

For 10.0, there are 2 either-or options: either we revert, or we
pretend these have always been in 10.0.x and compensate, like I described
in my previous email in this thread (to which you're replying) and
am re-describing now.

>> 3. on top of these 2 "missing features: pdcm, arch-capabilities",
>>     make the crossing line for before-10.0, not for before-10.1 series, -
>>     ie, consider 10.0 *also* has these properties, but 9.2 and before
>>     are not.

> This issue is indeed quite tricky. Sometimes people (including myself)
> assume that backporting fixes to the stable branch can avoid adding a
> compat option. Now it seems the compat option is the better choice, as
> users need to ensure migration rather than downtime before upgrading to
> the stable version :-(.

It's a good (hopefully) lesson for me, - I blindly picked up
a change which felt like an innocent one (I even mentioned that in the
commit - it's a "cleanup patch") - just so a subsequent change in this
area would apply cleanly.  But it wasn't a cleanup, and it wasn't trivial
at all.  So I must be much more careful next time.  I'm talking about
3d26cb65c2 "Move adjustment of CPUID_EXT_PDCM..".

Speaking of the other change - it fixed a real bug which I hit myself,
and I had no idea it was tricky - actually no one did until
e9efa4a771 "property for arch_capabilities".  So yes, this is a "sh*t
happens" case :)


Ok.

So, back to the situation and the plan (two of them).


1. It looks like we agree we should pick

6529f31e0d target/i386: add compatibility property for pdcm feature
e9efa4a771 target/i386: add compatibility property for arch_capabilities

to 10.1.x, to make migration from older versions to 10.1.x work.


2.  For 10.0.x, we've two options:

  2.a.  Revert
     24778b1c7e "do not expose ARCH_CAPABILITIES"
     3d26cb65c2 "Move adjustment of CPUID_EXT_PDCM"
   as you initially suggested and already reviewed.

   This will make 10.0.[45] "bad" wrt migration, and will re-create the
   issues these 2 commits fixed, but will make the next 10.0.x as good as
   the initial 10.0.0 wrt migration.

  2.b.  Instead of reverting these two, which are already in 10.0.[45],
   pretend 10.0 always had these 2 commits, and adjust subsequent
   qemu versions just like we did with the 2 "add compatibility property"
   changes, but make them 9.2-compat properties, not 10.0-compat
   properties.

   This - as far as I can see - will make 10.0.[0-3] "bad" wrt
   migration, but not subsequent 10.0.x ones.  And it will keep the bugs
   fixed in 10.0.x too.

But again, I don't understand the migration logic well, so don't know
if it even makes sense.  2.b, if deemed to be good, will be the first
in history (I think) to introduce compat properties for past machine
types.
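
To make the difference between the current placement and option 2.b
concrete, here is a small Python model of how compat lists are (as far
as I understand it) chained: the list tied to version V is applied by
machine types V and older.  The property names are hypothetical
placeholders, not the names used by the two commits, and this is only a
sketch of the idea, not QEMU code.

```python
# Simplified model of versioned machine types and their compat lists.
VERSIONS = ["9.2", "10.0", "10.1"]  # oldest to newest

# Hypothetical property names, for illustration only.
LEGACY_PROPS = {"legacy-arch-cap": "on", "legacy-pdcm": "on"}

# Current placement: properties live in the 10.0 compat list.
UPSTREAM = {"10.0": LEGACY_PROPS}
# Option 2.b: the same properties moved to the 9.2 compat list.
OPTION_2B = {"9.2": LEGACY_PROPS}


def effective_compat(machine, compat_lists):
    """Merge every compat list that applies to the given machine type.

    A machine type applies its own list plus the lists of all newer
    versions, so older machine types accumulate more overrides.
    """
    props = {}
    start = VERSIONS.index(machine)
    # Apply newest first so that older lists can override newer ones.
    for version in reversed(VERSIONS[start:]):
        props.update(compat_lists.get(version, {}))
    return props


if __name__ == "__main__":
    for name, lists in (("current", UPSTREAM), ("option 2.b", OPTION_2B)):
        for machine in VERSIONS:
            print(name, machine, effective_compat(machine, lists))
```

In this model the only machine type whose behaviour differs between the
two placements is 10.0: it keeps the legacy CPUID behaviour with the
current placement and gets the new behaviour under 2.b.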

Please excuse me for so much text :)

Thank you!

/mjt



* Re: [RFT PATCH v2 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities
  2025-10-13 17:22       ` Michael Tokarev
@ 2025-10-14 10:49         ` Hector Cao
  2025-10-14 13:53           ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Hector Cao @ 2025-10-14 10:49 UTC (permalink / raw)
  To: Michael Tokarev
  Cc: Zhao Liu, Paolo Bonzini, qemu-devel, lk, berrange, Peter Maydell,
	qemu-stable


On Mon, Oct 13, 2025 at 7:22 PM Michael Tokarev <mjt@tls.msk.ru> wrote:

> On 10/13/25 10:22, Zhao Liu wrote:
> > On Fri, Oct 10, 2025 at 08:40:56PM +0300, Michael Tokarev wrote:
> ..>>> I found the previous 2 fixes were merged into stable 10.0:
> >>>
> >>> 24778b1c7ee7aca9721ed4757b0e0df0c16390f7
> >>> 3d26cb65c27190e57637644ecf6c96b8c3d246a3
> >>>
> >>> Should stable 10.0 revert these 2 fixes, to ensure migration
> >>> compatibility?
> >
> > Sorry for late...just return from vacation.
>
> I returned from vacation today too :)
>
> >> Now when I think about it.
> >>
> >> There were at least 2 point releases of 10.0.x (10.0.4 & 10.0.5)
> >> with these 2 patches already.
> >
> > EMM, it seems 10.0.x (x < 4) can't migrate to 10.0.y (4 <= y <= 5),
> > right? If so, could we treat this behavior as a regression?
>
> It is a regression in 10.0.4 indeed.  But it already lasted for
> 2 stable releases (10.0.4 & 10.0.5).  So by reverting the above
> mentioned two changes in 10.0.6, we'll make yet another regression,
> now when migrating from 10.0.[45] to 10.0.6. This is why I thought
> it might be an idea to keep just one regression in 10.0.x, so to
> say.  Especially since these changes already fixes issues with
> existing guests, so by reverting them, we'll bring them back to
> 10.0.x.
>
> It is an either-or combination.  It is not bad either way, I'm just
> thinking what is best currently.
>
> And with my limited understanding of the migration issue in the context
> (for which I asked for clarification some 5 or 6 times already), it
> feels to me like "pretending" these above 2 mentioned above patches has
> always been part of 10.0.x, - declare that migration wont work from
> 10.0.[1-3] (or [1-5]?) to subsequent versions, and be done with it.
>
> And modify the 2 properties introduced by:
>
> 6529f31e0d target/i386: add compatibility property for pdcm feature
> e9efa4a771 target/i386: add compatibility property for arch_capabilities
>
> to be part of pc_compat_9_2 machine, not 10.0..


> Hopefully it's understandable what I mean.
>
>
Hello Michael,

IIUC, there is no perfect solution that makes migration work for all
combinations of versions, as you already pointed out.
Reverting the two faulty commits in 10.0.x will reduce the scope of
migration failures (10.0.x -> 10.0.y / 10.1.z).

You seem to propose backporting the migration fixes (with compatibility
properties) to 10.0.y, but I don't know if that is possible since only
the 10.0 machine type is available.

Applying these compatibility properties only in 9.2 (and older) might
make sense IMHO, since 10.0.y behaves the same way as 10.1.

> >> Reverting them in 10.0 will make
> >> 10.0 to be non-migratable with itself (10.0.5 can't be migrated
> >> to 10.0.6 if we'll release 10.0.6 with these 2 patches reverted).
> >>
> >> Also, as far as I can see (and I asked about this some 5 times
> >> already, with no one answering - is it that difficult?) - we
> >> should pick this series (pdcm, arch-capabilities) to 10.1.x stable
> >> series too, since we can't migrate from previous versions to 10.1
> >> which has the two changes mentioned above.
> >
> > I think so. in this series, Paolo added compat options in pc_compat_10_0
> > so it should be picked to stable v10.1.
>
> Again, I asked about this some 5 times already, with no single
> answer.
>
> >> It looks to me - since the breakage is already done, and both 10.0
> >> and 10.1 is broken, we should declare the current situation as a
> >> status quo, and do the following:
> >>
> >> 1. keep the above mentioned 24778b1c7ee7a and 3d26cb65c27190e5 in
> >>     10.0.x (instead of reverting them);
> >>
> >> 2. pick up this 2 patches (fix cross migration issue with missing
> >>     pdcm, arch-capabilities) to 10.1.x (it should be done either way,
> >>     I think);
> >
> > IIUC, if we picked current compat options to stable v10.1, then stable
> > v10.1 requires previous v10.0 sets the pdcm & arch-cap bits (i.e., do
> > not apply the fixes or revert the previous fix).
>
> Ugh.  Confusion++ :)  As you wrote yourself right above, "Paolo added
> compat options in pc_compat_10_0, so it should be picked up to stable
> 10.1".  This point "2" is exactly this case I'm talking about.  Two
> commits:
>
> 6529f31e0d target/i386: add compatibility property for pdcm feature
> e9efa4a771 target/i386: add compatibility property for arch_capabilities
>
> should be picked up for 10.1.x.
>
> This "2" point is not (yet) about 10.0.x.
>
>
> > So it seems the reverts are unavoidable on v10.0?
> >
> > (Let's see what Paolo and the other maintainers think.)
>
> For 10.0, there are 2 either-or options: either we revert, or we
> pretend these has always been in 10.0.x and compensate, like I described
> in my previous email in this thread (to which you're replying) and
> re-describing now.
>
> >> 3. on top of these 2 "missing features: pdcm, arch-capabilities",
> >>     make the crossing line for before-10.0, not for before-10.1 series, -
> >>     ie, consider 10.0 *also* has these properties, but 9.2 and before
> >>     are not.
>
> > This issue is indeed quite tricky. Sometimes people (including myself)
> > assume that backporting fixes to the stable branch can avoid adding a
> > compat option. Now it seems the compat option is the better choice, as
> > users need to ensure migration rather than downtime before upgrading to
> > the stable version :-(.
> It's a good (hopefully) lesson for me myself, - I blindly picked up
> a change which felt like an innocent (I even mentioned that in a commit
> - it's a "cleanup patch") - just so a subsequent change in this area
> applies cleanly.  But it wasn't a cleanup, and it wasn't trivial at
> all.  So I must be much more careful the next time.  I'm talking about
> 3d26cb65c2 "Move adjustment of CPUID_EXT_PDCM..".
>
> Speaking of the other change - it fixed a real bug which I hit myself,
> and I had no idea it's tricky - actually no one had this idea until
> e9efa4a771 "property for arch_capabilities".  So yes, this is a "sh*t
> happens" case :)
>
>
> Ok.
>
> So, back to the situation and the plan (two of them).
>
>
> 1. It looks like we agree we should pick
>
> 6529f31e0d target/i386: add compatibility property for pdcm feature
> e9efa4a771 target/i386: add compatibility property for arch_capabilities
>
> to 10.1.x, to make migration from older versions to 10.1.x work.
>
>
> 2.  For 10.0.x, we've two options:
>
>   2.a.  Revert
>      24778b1c7e "do not expose ARCH_CAPABILITIES"
>      3d26cb65c2 "Move adjustment of CPUID_EXT_PDCM"
>    as you initially suggested and already reviewed.
>
>    This will make 10.0.[45] "bad" wrt migration, and will re-create the
>    issues these 2 commits fixed, but will make next 10.0.x as good as
>    initial 10.0.0 wrt migration.
>
>   2.b.  Instead of reverting these two which are already in 10.0.[45],
>    pretend 10.0 always had these 2 commits, and adjust subsequent
>    qemu versions just like we did with 2 "add compatibility property"
>    changes, but make it to be 9.2-compat property, not 10.0-compat
>    property.
>
>    This - as far as I can see - will make 10.0.[0-3] to be "bad" wrt
>    migration, but not subsequent 10.0.x ones.  And will keep the bugs
>    fixed in 10.0.x too.
>
> But again, I don't understand the migration logic well, so don't know
> if it even makes sense.  2.b, if deemed to be good, will be the first
> in history (I think) to introduce compat properties for past machine
> types.
>
> Please excuse me for so much text :)
>
> Thank you!
>
> /mjt
>


-- 
Hector CAO
Software Engineer – Server Team / Virtualization
hector.cao@canonical.com
https://launchpad.net/~hectorcao



* Re: [RFT PATCH v2 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities
  2025-10-14 10:49         ` Hector Cao
@ 2025-10-14 13:53           ` Paolo Bonzini
  2025-10-14 14:40             ` Michael Tokarev
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2025-10-14 13:53 UTC (permalink / raw)
  To: Hector Cao
  Cc: Michael Tokarev, Zhao Liu, qemu-devel, lk, berrange,
	Peter Maydell, qemu-stable

On Tue, Oct 14, 2025 at 12:49 PM Hector Cao <hector.cao@canonical.com> wrote:
>
>
>
> On Mon, Oct 13, 2025 at 7:22 PM Michael Tokarev <mjt@tls.msk.ru> wrote:
>>
>> On 10/13/25 10:22, Zhao Liu wrote:
>> > On Fri, Oct 10, 2025 at 08:40:56PM +0300, Michael Tokarev wrote:
>> ..>>> I found the previous 2 fixes were merged into stable 10.0:
>> >>>
>> >>> 24778b1c7ee7aca9721ed4757b0e0df0c16390f7
>> >>> 3d26cb65c27190e57637644ecf6c96b8c3d246a3
>> >>>
>> >>> Should stable 10.0 revert these 2 fixes, to ensure migration
>> >>> compatibility?
>> >
>> > Sorry for late...just return from vacation.
>>
>> I returned from vacation today too :)
>>
>> >> Now when I think about it.
>> >>
>> >> There were at least 2 point releases of 10.0.x (10.0.4 & 10.0.5)
>> >> with these 2 patches already.
>> >
>> > EMM, it seems 10.0.x (x < 4) can't migrate to 10.0.y (4 <= y <= 5),
>> > right? If so, could we treat this behavior as a regression?
>>
>> It is a regression in 10.0.4 indeed.  But it already lasted for
>> 2 stable releases (10.0.4 & 10.0.5).  So by reverting the above
>> mentioned two changes in 10.0.6, we'll make yet another regression,
>> now when migrating from 10.0.[45] to 10.0.6. This is why I thought
>> it might be an idea to keep just one regression in 10.0.x, so to
>> say.  Especially since these changes already fixes issues with
>> existing guests, so by reverting them, we'll bring them back to
>> 10.0.x.
>>
>> It is an either-or combination.  It is not bad either way, I'm just
>> thinking what is best currently.
>>
>> And with my limited understanding of the migration issue in the context
>> (for which I asked for clarification some 5 or 6 times already), it
>> feels to me like "pretending" these above 2 mentioned above patches has
>> always been part of 10.0.x, - declare that migration wont work from
>> 10.0.[1-3] (or [1-5]?) to subsequent versions, and be done with it.
>>
>> And modify the 2 properties introduced by:
>>
>> 6529f31e0d target/i386: add compatibility property for pdcm feature
>> e9efa4a771 target/i386: add compatibility property for arch_capabilities
>>
>> to be part of pc_compat_9_2 machine, not 10.0..
>>
>> Hopefully it's understandable what I mean.
>
> IIUC, there is no perfect solution that makes migration work for all combinations
> of versions as you already pointed out.
> Reverting the two faulty commits in 10.0.x will reduce the scope of migration failures (10.0.x -> 10.0.y / 10.1.z)

Yes, I agree. In my opinion reverting is the best option, because it
makes the machine types as constant as possible. Any change in the
machine types is a bug and the fix is to revert to the previous
situation.

Paolo




* Re: [RFT PATCH v2 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities
  2025-10-14 13:53           ` Paolo Bonzini
@ 2025-10-14 14:40             ` Michael Tokarev
  0 siblings, 0 replies; 15+ messages in thread
From: Michael Tokarev @ 2025-10-14 14:40 UTC (permalink / raw)
  To: Paolo Bonzini, Hector Cao
  Cc: Zhao Liu, qemu-devel, lk, berrange, Peter Maydell, qemu-stable

On 10/14/25 16:53, Paolo Bonzini wrote:
> On Tue, Oct 14, 2025 at 12:49 PM Hector Cao <hector.cao@canonical.com> wrote:

>> Reverting the two faulty commits in 10.0.x will reduce the scope of migration failures (10.0.x -> 10.0.y / 10.1.z)
> 
> Yes, I agree. In my opinion reverting is the best option, because it
> makes the machine types as constant as possible. Any change in the
> machine types is a bug and the fix is to revert to the previous
> situation.

Ok, let's just revert them, despite the fact it's been two stable
10.0.x releases already.

How exactly does migration fail, when it fails, and what's our
breakage matrix?

   <=10.0.3
     10.0.4,10.0.5
   >=10.0.6 with reverts (I plan to make this release ASAP)

I guess migration between <=10.0.3 and >=10.0.6 will be just fine.
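
A rough way to tabulate the three groups above is the Python sketch
below.  It assumes (a simplification) that a pc-10.0 guest migrates
cleanly only when both ends expose the same CPUID behaviour, and it
treats a mismatch as a failure in both directions, which may be too
pessimistic.  It also assumes 10.0.6 ships with the reverts.  The helper
names are made up for illustration.

```python
# Representative releases for the three groups listed above.
RELEASES = ["10.0.3", "10.0.4", "10.0.5", "10.0.6"]


def cpuid_behaviour(release):
    """Return 'changed' for releases carrying the two commits, else 'original'."""
    patch = int(release.split(".")[2])
    # 10.0.4 and 10.0.5 carry the two commits; 10.0.6 is assumed to revert them.
    return "changed" if patch in (4, 5) else "original"


def can_migrate(src, dst):
    """Model: migration works only if both ends behave the same way."""
    return cpuid_behaviour(src) == cpuid_behaviour(dst)


def breakage_matrix(releases=RELEASES):
    """Map every (source, destination) pair to True (ok) or False (fails)."""
    return {(s, d): can_migrate(s, d) for s in releases for d in releases}


if __name__ == "__main__":
    matrix = breakage_matrix()
    print("src \\ dst " + " ".join(f"{d:>7}" for d in RELEASES))
    for s in RELEASES:
        row = " ".join(
            f"{'ok' if matrix[(s, d)] else 'FAIL':>7}" for d in RELEASES
        )
        print(f"{s:>9} {row}")
```

Under these assumptions the only failing pairs are those that cross
between the 10.0.[45] group and everything else.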

Also, it looks like 10.1.2 should be released ASAP too, with the fixes
on top picked up from the master branch.

Let's plan two new stable releases (10.0.6 & 10.1.2) for the next week.

Thanks,

/mjt



Thread overview: 15+ messages
2025-09-23 10:41 [RFT PATCH v2 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities Paolo Bonzini
2025-09-23 10:41 ` [RFT PATCH v2 1/2] target/i386: add compatibility property for arch_capabilities Paolo Bonzini
2025-09-25 16:09   ` Zhao Liu
2025-09-23 10:41 ` [RFT PATCH v2 2/2] target/i386: add compatibility property for pdcm feature Paolo Bonzini
2025-09-25 16:10   ` Zhao Liu
2025-09-25 16:17 ` [RFT PATCH v2 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities Zhao Liu
2025-09-28  9:41   ` Paolo Bonzini
2025-10-08  8:47     ` Michael Tokarev
2025-10-08 13:32   ` Michael Tokarev
2025-10-10 17:40   ` Michael Tokarev
2025-10-13  7:22     ` Zhao Liu
2025-10-13 17:22       ` Michael Tokarev
2025-10-14 10:49         ` Hector Cao
2025-10-14 13:53           ` Paolo Bonzini
2025-10-14 14:40             ` Michael Tokarev
