* [RFC PATCH 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities
@ 2025-09-12 13:35 Hector Cao
0 siblings, 0 replies; 6+ messages in thread
From: Hector Cao @ 2025-09-12 13:35 UTC (permalink / raw)
To: qemu-devel; +Cc: Paolo Bonzini, Zhao Liu, peterx, farosas
Thanks to Fiona Ebner for pointing out (in a DM) that I did not CC the
relevant maintainers.
Let me CC the maintainers listed by the ./scripts/get_maintainer.pl
script for the files changed in this submission.
Kind regards,
Hector
* Re: Issues with pdcm in qemu 10.1-rc on migration and save/restore
@ 2025-09-04 14:35 Hector Cao
2025-09-10 11:57 ` [RFC PATCH 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities Hector Cao
0 siblings, 1 reply; 6+ messages in thread
From: Hector Cao @ 2025-09-04 14:35 UTC (permalink / raw)
To: Christian Ehrhardt
Cc: Paolo Bonzini, Daniel P. Berrangé, Xiaoyao Li, Zhao Liu,
qemu-devel
Hello,
In addition to my previous mail describing the issue on different
Ubuntu releases, I went further and tested upstream qemu directly at HEAD
(baa79455fa92984ff0f4b9ae94bed66823177a27).
As the source version for the migration, I used the fairly recent
v10.0.x releases to keep the version gap small.
I can reproduce the following migration failures:
v10.0.2 -> HEAD:
error: operation failed: guest CPU doesn't match specification:
missing features: pdcm,arch-capabilities
v10.0.3 -> HEAD:
error: operation failed: guest CPU doesn't match specification:
missing features: pdcm
arch-capabilities is no longer reported as missing because v10.0.3,
like HEAD, already contains the commit reverted in [2].
If I revert the two commits [1] and [2] in HEAD, the migration works fine:
v10.0.2 -> HEAD (+reverts):
OK
[1] Revert "i386/cpu: Move adjustment of CPUID_EXT_PDCM before
feature_dependencies[] check"
This reverts commit e68ec2980901c8e7f948f3305770962806c53f0b.
[2] Revert "target/i386: do not expose ARCH_CAPABILITIES on AMD CPU"
This reverts commit d3a24134e37d57abd3e7445842cda2717f49e96d.
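To make the failure mode concrete, here is a minimal Python sketch of the kind of check that produces this message. It is a simplified model and not libvirt's or QEMU's actual code: the function name `check_guest_cpu` and the feature lists are assumptions, chosen only to mirror the three results above.

```python
def check_guest_cpu(expected, provided):
    """Compare the features a guest was started with against what the
    destination offers. Simplified model, not the real libvirt/QEMU logic."""
    missing = [f for f in expected if f not in provided]
    if missing:
        return ("guest CPU doesn't match specification: "
                "missing features: " + ",".join(missing))
    return "OK"


# Features the running guest carries from each source version (assumed,
# derived from the error messages reported in this mail).
GUEST_FROM_10_0_2 = ["pdcm", "arch-capabilities"]
GUEST_FROM_10_0_3 = ["pdcm"]

# Features the destination exposes for the same CPU definition (assumed).
HEAD = set()
HEAD_WITH_REVERTS = {"pdcm", "arch-capabilities"}

for name, guest, dest in [
    ("v10.0.2 -> HEAD", GUEST_FROM_10_0_2, HEAD),
    ("v10.0.3 -> HEAD", GUEST_FROM_10_0_3, HEAD),
    ("v10.0.2 -> HEAD (+reverts)", GUEST_FROM_10_0_2, HEAD_WITH_REVERTS),
]:
    print(name + ": " + check_guest_cpu(guest, dest))
```

The point of the model is only that the destination must expose every feature the guest already sees; any feature that silently disappears between versions turns into a "missing features" failure.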
Since this issue is blocking us for the Ubuntu 25.10 release, can you
please provide feedback on the best path going forward?
On Wed, Sep 3, 2025 at 10:38 AM Christian Ehrhardt <
christian.ehrhardt@canonical.com> wrote:
> On Wed, Aug 20, 2025 at 7:11 AM Christian Ehrhardt
> <christian.ehrhardt@canonical.com> wrote:
> >
> > On Tue, Aug 19, 2025 at 4:51 PM Paolo Bonzini <pbonzini@redhat.com>
> wrote:
> > >
> > > On 8/6/25 21:18, Daniel P. Berrangé wrote:
> > > > On Wed, Aug 06, 2025 at 07:57:34PM +0200, Christian Ehrhardt wrote:
> > > >> On Wed, Aug 6, 2025 at 2:00 PM Daniel P. Berrangé <
> berrange@redhat.com> wrote:
> > > >>>
> > > >>> On Wed, Aug 06, 2025 at 01:52:17PM +0200, Christian Ehrhardt wrote:
> > > >>>> Hi,
> > > >>>> I was unsure if this would be better sent to libvirt or qemu - the
> > > >>>> issue is somewhere between libvirt modelling CPUs and qemu 10.1
> > > >>>> behaving differently. I did not want to double post and gladly
> most of
> > > >>>> the people are on both lists - since the switch in/out of the
> problem
> > > >>>> is qemu 10.0 <-> 10.1 let me start here. I beg your pardon for
> not yet
> > > >>>> having all the answers, I'm sure I could find more with
> debugging, but
> > > >>>> I also wanted to report early for your awareness while we are
> still in
> > > >>>> the RC phase.
> > > >>>>
> > > >>>>
> > > >>>> # Problem
> > > >>>>
> > > >>>> What I found when testing migrations in Ubuntu with qemu 10.1-rc1
> was:
> > > >>>> error: operation failed: guest CPU doesn't match specification:
> > > >>>> missing features: pdcm
> > > >>>>
> > > >>>> This is behaving the same with libvirt 11.4 or the more recent
> 11.6.
> > > >>>> But switching back to qemu 10.0 confirmed that this behavior is
> new
> > > >>>> with qemu 10.1-rc.
> > > >>>
> > > >>>
> > > >>>> Without yet having any hard evidence against them I found a few
> pdcm
> > > >>>> related commits between 10.0 and 10.1-rc1:
> > > >>>> 7ff24fb65 i386/tdx: Don't mask off CPUID_EXT_PDCM
> > > >>>> 00268e000 i386/cpu: Warn about why CPUID_EXT_PDCM is not
> available
> > > >>>> e68ec2980 i386/cpu: Move adjustment of CPUID_EXT_PDCM before
> > > >>>> feature_dependencies[] check
> > > >>>> 0ba06e46d i386/tdx: Add TDX fixed1 bits to supported CPUIDs
> > > >>>>
> > > >>>>
> > > >>>> # Caveat
> > > >>>>
> > > >>>> My test environment is in LXD system containers, that gives me
> issues
> > > >>>> in the power management detection
> > > >>>> libvirtd[406]: error from service:
> GDBus.Error:System.Error.EROFS:
> > > >>>> Read-only file system
> > > >>>> libvirtd[406]: Failed to get host power management capabilities
> > > >>>
> > > >>> That's harmless.
> > > >>
> > > >> Yeah, it always was for me - thanks for confirming.
> > > >>
> > > >>>> And the resulting host-model on a rather old test server will
> therefore have:
> > > >>>> <cpu mode='custom' match='exact' check='full'>
> > > >>>> <model fallback='forbid'>Haswell-noTSX-IBRS</model>
> > > >>>> <vendor>Intel</vendor>
> > > >>>> <feature policy='require' name='vmx'/>
> > > >>>> <feature policy='disable' name='pdcm'/>
> > > >>>> ...
> > > >>>>
> > > >>>> But that was fine in the past, and the behavior started to break
> > > >>>> save/restore or migrations just now with the new qemu 10.1-rc.
> > > >>>>
> > > >>>> # Next steps
> > > >>>>
> > > >>>> I'm soon overwhelmed by meetings for the rest of the day, but
> would be
> > > >>>> curious if one has a suggestion about what to look at next for
> > > >>>> debugging or a theory about what might go wrong. If nothing else
> comes
> > > >>>> up I'll try to set up a bisect run tomorrow.
> > > >>>
> > > >>> Yeah, git bisect is what I'd start with.
> > > >>
> > > >> Bisect complete, identified this commit
> > > >>
> > > >> commit 00268e00027459abede448662f8794d78eb4b0a4
> > > >> Author: Xiaoyao Li <xiaoyao.li@intel.com>
> > > >> Date: Tue Mar 4 00:24:50 2025 -0500
> > > >>
> > > >> i386/cpu: Warn about why CPUID_EXT_PDCM is not available
> > > >>
> > > >> When user requests PDCM explicitly via "+pdcm" without PMU
> enabled, emit
> > > >> a warning to inform the user.
> > > >>
> > > >> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> > > >> Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
> > > >> Link:
> https://lore.kernel.org/r/20250304052450.465445-3-xiaoyao.li@intel.com
> > > >> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > > >>
> > > >> target/i386/cpu.c | 3 +++
> > > >> 1 file changed, 3 insertions(+)
> > > >>
> > > >>
> > > >>
> > > >> Which is odd as it should only add a warning right?
> > > >
> > > > No, that commit message is misleading.
> > > >
> > > > > IIUC mark_unavailable_features() actively blocks usage of the feature,
> > > > > so it is a functional change, not merely emitting a warning.
> > > >
> > > > It makes me wonder if that commit was actually intended to block the
> > > > feature or not, vs merely warning ? CC'ing those involved in the
> > > > commit.
> > > We can revert the commit. I'll send the revert to Stefan and let him
> > > decide whether to include it in 10.1-rc4 or delay to 10.2 and 10.1.1.
> >
> > Thanks Paolo for considering that.
> >
> > My steps to reproduce seemed really clear and are 100% reproducible
> > for me, but no one so far said "yeah they see it too", so I'm getting
> > unsure if it was not tried by anyone else or if there is more to it
> > than we yet know.
> > Further I tested more with the commit reverted, and found that at
> > least cross version migrations (9.2 -> 10.1) still have issues that
> > seem related - complaining about pdcm as missing feature.
> > But that was in a log of a test system that went away and ... you know
> > how these things can sometimes be, that new result is not yet very
> > reliable.
> >
> > I intended to check the following matrix more deeply again with and
> > without the reverted change and then come back to this thread:
> >
> > #1 Compare platforms
> > - Migrating between non containerized hosts to verify if they are
> > affected as well
> > - Power management explicitly switched off/on (vs the auto detect of
> > host-model) in the guest XML
> > #2 Retest the different Use-cases I've seen this pop up
> > - 10.1 managed save (broken unless reverting the commit that was
> identified)
> > - 9.2 -> 10.1 migration (seems broken even with the revert)
>
> I need to come back to this aspect of it - the cross release or cross
> qemu version migrations.
>
> Hector (on CC) helps me on that now - sadly we were able to confirm
> that migrations from older qemu versions no longer work.
> Yep 10.1 is released by now so it might end up as "The problem is what
> happens when we detect after we have done a release that something has
> gone wrong" from [2].
> But I still can't believe only we see this and therefore for now want
> to believe I messed up on our side when merging 10.1 :-)
>
> For now this is a call if others have also seen any older release
> migrating to 10.1 to throw:
> error: operation failed: guest CPU doesn't match specification:
> missing features: pdcm,arch-capabilities
>
> Hector will later today reply here with a summary of what we found so
> far, to provide you a more complete picture to think about, without
> having to read through all the messy interim steps in the Ubuntu bug.
>
> [1]: https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/2121787
> [2]:
> https://gitlab.com/qemu-project/qemu/-/blob/master/docs/devel/migration/compatibility.rst?plain=1#L322
>
> > The hope was that these will help to further identify what is going
> > on, but despite the urgency of the release being imminent I have not
> > yet managed to find the time in the last two days :-/
> >
> > > Sorry for the delay in answering (and thanks Daniel for bringing this
> to
> > > my attention).
> > >
> > > Thanks,
> > >
> > > Paolo
> > >
> >
> >
> > --
> > Christian Ehrhardt
> > Director of Engineering, Ubuntu Server
> > Canonical Ltd
>
>
>
> --
> Christian Ehrhardt
> Director of Engineering, Ubuntu Server
> Canonical Ltd
>
--
Hector CAO
Software Engineer – Partner Engineering Team
hector.cao@canonical.com
https://launchpad.net/~hectorcao
* [RFC PATCH 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities
2025-09-04 14:35 Issues with pdcm in qemu 10.1-rc on migration and save/restore Hector Cao
@ 2025-09-10 11:57 ` Hector Cao
2025-09-23 7:53 ` Paolo Bonzini
0 siblings, 1 reply; 6+ messages in thread
From: Hector Cao @ 2025-09-10 11:57 UTC (permalink / raw)
To: qemu-devel
Hello,
Since it is a blocking issue for us, we went further and ended up with a solution along the lines of [1]
that allows us to get out of this situation.
The idea is to add compatibility properties that restore the legacy behavior for machine types
of older QEMU versions (<10.1). Two compatibility properties have been added, one for each of
the 2 missing features, and each is implemented in a separate patch.
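As a rough illustration of the idea, here is a minimal Python model of versioned compatibility properties. It is a sketch under stated assumptions, not the code in the patches: the property names `x-legacy-pdcm` and `x-legacy-arch-capabilities` and the function `cpu_props_for_machine` are hypothetical and exist only for this example.

```python
# Default values used by the newest machine type (new behavior).
# The property names are hypothetical, for illustration only.
DEFAULTS = {
    "x-legacy-pdcm": False,
    "x-legacy-arch-capabilities": False,
}

# Overrides keyed by the newest machine-type version they apply to.
# An entry for (10, 0) covers the 10.0 machine types and all older ones.
COMPAT_PROPS = {
    (10, 0): {
        "x-legacy-pdcm": True,
        "x-legacy-arch-capabilities": True,
    },
}


def cpu_props_for_machine(version):
    """Return the CPU property values for a versioned machine type,
    given as a (major, minor) tuple."""
    props = dict(DEFAULTS)
    for compat_version, overrides in COMPAT_PROPS.items():
        if version <= compat_version:
            props.update(overrides)
    return props


print(cpu_props_for_machine((9, 2)))   # legacy behavior restored
print(cpu_props_for_machine((10, 1)))  # new behavior kept
```

With this scheme a guest started on an older machine type keeps seeing the CPU features it was created with, while guests on the 10.1 machine type and newer get the new behavior.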
We know that 10.1 has been released and is final, but working on a solution towards 11.0
would allow everyone to settle on the fix and even consider backporting it to releases that are not yet out,
like Ubuntu 25.10 for us.
It is important to have upstream support going forward, in this or any other way,
and we therefore reach out with this RFC to ask you to think about it with us.
[1] https://gitlab.com/qemu-project/qemu/-/blob/master/docs/devel/migration/compatibility.rst
Hector Cao (2):
target/i386: add compatibility property for arch_capabilities
target/i386: add compatibility property for pdcm feature
hw/core/machine.c | 2 ++
migration/migration.h | 23 +++++++++++++++++++++++
migration/options.c | 6 ++++++
target/i386/cpu.c | 17 ++++++++++++++---
target/i386/kvm/kvm.c | 5 ++++-
5 files changed, 49 insertions(+), 4 deletions(-)
--
2.45.2
* Re: [RFC PATCH 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities
2025-09-10 11:57 ` [RFC PATCH 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities Hector Cao
@ 2025-09-23 7:53 ` Paolo Bonzini
2025-09-23 10:08 ` Hector Cao
0 siblings, 1 reply; 6+ messages in thread
From: Paolo Bonzini @ 2025-09-23 7:53 UTC (permalink / raw)
To: Hector Cao, qemu-devel
On 9/10/25 13:57, Hector Cao wrote:
> Hello,
>
> Since it is a blocking issue for us, we went further and ended up with a solution along [1]
> that allows us to get out of this situation.
>
> The idea is to add compatibility properties to restore legacy behaviors for machine types
> with older versions of QEMU (<10.1). 2 compatiblity properties have been added to address
> respectively the 2 missing features, each one is done in a separate patch.
>
> We know that 10.1 has been released and it's final, but working on a solution towards 11.0
> would allow everyone to settle on the fix and even consider backporting where not yet released
> like Ubuntu 25.10 for us.
Thanks, I have applied the patch. It's better to have the fix in 10.1.1.
Sorry for the delay, I was on vacation for one week and working reduced
hours the next.
Paolo
> It is important to have upstream support going forward in this or any other way
> and therefore reach out with this RFC to ask you to think about it with us.
>
> [1] https://gitlab.com/qemu-project/qemu/-/blob/master/docs/devel/migration/compatibility.rst
>
> Hector Cao (2):
> target/i386: add compatibility property for arch_capabilities
> target/i386: add compatibility property for pdcm feature
>
> hw/core/machine.c | 2 ++
> migration/migration.h | 23 +++++++++++++++++++++++
> migration/options.c | 6 ++++++
> target/i386/cpu.c | 17 ++++++++++++++---
> target/i386/kvm/kvm.c | 5 ++++-
> 5 files changed, 49 insertions(+), 4 deletions(-)
>
* Re: [RFC PATCH 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities
2025-09-23 7:53 ` Paolo Bonzini
@ 2025-09-23 10:08 ` Hector Cao
2025-09-23 10:15 ` Paolo Bonzini
0 siblings, 1 reply; 6+ messages in thread
From: Hector Cao @ 2025-09-23 10:08 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: qemu-devel
Thanks Paolo,
Is there still time for me to submit a v2 of this patch? I would like to
add 2 changes:
- add the fixes:xxx line suggested by Daniel
- fix the link error for the qemu-user build (since it has no access to
migration code)
Best,
Hector
On Tue, Sep 23, 2025 at 09:53, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 9/10/25 13:57, Hector Cao wrote:
> > Hello,
> >
> > Since it is a blocking issue for us, we went further and ended up with a
> solution along [1]
> > that allows us to get out of this situation.
> >
> > The idea is to add compatibility properties to restore legacy behaviors
> for machine types
> > with older versions of QEMU (<10.1). 2 compatiblity properties have been
> added to address
> > respectively the 2 missing features, each one is done in a separate
> patch.
> >
> > We know that 10.1 has been released and it's final, but working on a
> solution towards 11.0
> > would allow everyone to settle on the fix and even consider backporting
> where not yet released
> > like Ubuntu 25.10 for us.
>
> Thanks, I have applied the patch. It's better to have the fix in 10.1.1.
>
> Sorry for the delay, I was on vacation for one week and working reduced
> hours the next.
>
> Paolo
>
> > It is important to have upstream support going forward in this or any
> other way
> > and therefore reach out with this RFC to ask you to think about it with
> us.
> >
> > [1]
> https://gitlab.com/qemu-project/qemu/-/blob/master/docs/devel/migration/compatibility.rst
> >
> > Hector Cao (2):
> > target/i386: add compatibility property for arch_capabilities
> > target/i386: add compatibility property for pdcm feature
> >
> > hw/core/machine.c | 2 ++
> > migration/migration.h | 23 +++++++++++++++++++++++
> > migration/options.c | 6 ++++++
> > target/i386/cpu.c | 17 ++++++++++++++---
> > target/i386/kvm/kvm.c | 5 ++++-
> > 5 files changed, 49 insertions(+), 4 deletions(-)
> >
>
>
* Re: [RFC PATCH 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities
2025-09-23 10:08 ` Hector Cao
@ 2025-09-23 10:15 ` Paolo Bonzini
2025-09-23 10:31 ` Hector Cao
0 siblings, 1 reply; 6+ messages in thread
From: Paolo Bonzini @ 2025-09-23 10:15 UTC (permalink / raw)
To: Hector Cao; +Cc: qemu-devel
On Tue, Sep 23, 2025 at 12:08 PM Hector Cao <hector.cao@canonical.com> wrote:
>
> Thanks Paolo,
>
> Is it still time for me to submit the v2 of this patch ? I would like do add 2 changes:
> - add fixes:xxx line suggested by Daniel
> - fix link error for qemu-user build (since it has no access to migration code)
I have since noticed the link error indeed, and I'll post a v2 myself
with the fix.
Next time, if you notice a problem with the patch you should post the
fixed version without waiting for input.
Paolo
* Re: [RFC PATCH 0/2] Fix cross migration issue with missing features: pdcm, arch-capabilities
2025-09-23 10:15 ` Paolo Bonzini
@ 2025-09-23 10:31 ` Hector Cao
0 siblings, 0 replies; 6+ messages in thread
From: Hector Cao @ 2025-09-23 10:31 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: qemu-devel
On Tue, Sep 23, 2025 at 12:15, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On Tue, Sep 23, 2025 at 12:08 PM Hector Cao <hector.cao@canonical.com>
> wrote:
> >
> > Thanks Paolo,
> >
> > Is it still time for me to submit the v2 of this patch ? I would like do
> add 2 changes:
> > - add fixes:xxx line suggested by Daniel
> > - fix link error for qemu-user build (since it has no access to
> migration code)
>
> I have since noticed the link error indeed, and I'll post a v2 myself
> with the fix.
>
> Next time, if you notice a problem with the patch you should post the
> fixed version without waiting for input.
>
Lesson learnt, thanks!
Hector
>
> Paolo
>
>