* [PATCH] Documentation: KVM: Document guest-visible compatibility expectations @ 2026-05-11 8:57 David Woodhouse 2026-05-11 15:14 ` Paolo Bonzini 0 siblings, 1 reply; 29+ messages in thread From: David Woodhouse @ 2026-05-11 8:57 UTC (permalink / raw) To: Paolo Bonzini, Jonathan Corbet, Shuah Khan, kvm, linux-doc, linux-kernel Cc: Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest [-- Attachment #1: Type: text/plain, Size: 4433 bytes --] From: David Woodhouse <dwmw@amazon.co.uk> Document the expectation that KVM maintains guest-visible compatibility across host kernel upgrades and rollbacks. Specifically: - State saved/restored via KVM ioctls must be sufficient for live migration (and live update) between kernel versions. - Where a new kernel introduces a guest-visible change, it provides a mechanism for userspace to select the previous behaviour. - This allows both forward migration (upgrade) and backward migration (rollback) of guests. These expectations have been implicitly required on x86 but were not explicitly documented. Harmonise the expectations across all of KVM. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> --- Documentation/virt/kvm/api.rst | 14 ++++++++++++++ Documentation/virt/kvm/review-checklist.rst | 20 ++++++++++++++------ 2 files changed, 28 insertions(+), 6 deletions(-) diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 269970221797..864f3daa7acb 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -97,6 +97,20 @@ Instead, kvm defines extension identifiers and a facility to query whether a particular extension identifier is available. If it is, a set of ioctls is available for application use. 
+KVM will ensure that the state that can be saved and restored via the +KVM ioctls is sufficient to allow migration of a running guest between +host kernels while maintaining full compatibility of the guest-visible +device model. This includes migration to newer kernels (upgrade) and +to older kernels (rollback), provided that the older kernel supports +the set of features exposed to the guest. Where a new kernel version +introduces a guest-visible change, it will provide a mechanism (such +as a capability or a device attribute) that allows userspace to select +the previous behaviour. This serves two purposes: guests migrated +from an older kernel can continue to run with their original +observable environment, and new guests launched on the newer kernel +can be configured to match the feature set of the older kernel, so +that they remain migratable to the older kernel in case of rollback. + 4. API description ================== diff --git a/Documentation/virt/kvm/review-checklist.rst b/Documentation/virt/kvm/review-checklist.rst index 053f00c50d66..f0fbe1577a90 100644 --- a/Documentation/virt/kvm/review-checklist.rst +++ b/Documentation/virt/kvm/review-checklist.rst @@ -18,22 +18,30 @@ Review checklist for kvm patches 5. New features must default to off (userspace should explicitly request them). Performance improvements can and should default to on. -6. New cpu features should be exposed via KVM_GET_SUPPORTED_CPUID2, +6. Guest-visible changes must not break migration compatibility. A guest + migrated from an older kernel must be able to run with its original + observable environment, and a guest launched on a newer kernel must be + configurable to match the older kernel's feature set for rollback. + Where a change alters guest-visible behaviour, provide a mechanism + (capability, device attribute, etc.) for userspace to select the + previous behaviour. + +7. 
New cpu features should be exposed via KVM_GET_SUPPORTED_CPUID2, or its equivalent for non-x86 architectures -7. The feature should be testable (see below). +8. The feature should be testable (see below). -8. Changes should be vendor neutral when possible. Changes to common code +9. Changes should be vendor neutral when possible. Changes to common code are better than duplicating changes to vendor code. -9. Similarly, prefer changes to arch independent code than to arch dependent +10. Similarly, prefer changes to arch independent code than to arch dependent code. -10. User/kernel interfaces and guest/host interfaces must be 64-bit clean +11. User/kernel interfaces and guest/host interfaces must be 64-bit clean (all variables and sizes naturally aligned on 64-bit; use specific types only - u64 rather than ulong). -11. New guest visible features must either be documented in a hardware manual +12. New guest visible features must either be documented in a hardware manual or be accompanied by documentation. Testing of KVM code -- 2.43.0 [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5069 bytes --] ^ permalink raw reply related [flat|nested] 29+ messages in thread
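The rule in the api.rst hunk above (rollback works only if the older kernel supports everything exposed to the guest) can be illustrated with a toy model. This is a minimal sketch, not KVM code: the feature names and the two "kernel" sets are hypothetical, and real userspace would obtain them from capability and CPUID queries rather than from literals.

```python
# Toy model only: feature names and kernel sets are hypothetical,
# not real KVM capability identifiers.

def can_migrate(guest_features: frozenset, target_supported: frozenset) -> bool:
    """A guest can move to a target kernel only if the target supports
    every feature already exposed to the guest."""
    return guest_features <= target_supported


def rollback_safe_features(new_supported: frozenset, old_supported: frozenset) -> frozenset:
    """Feature set for a guest launched on the new kernel that must stay
    migratable back to the old one: the intersection of both sets."""
    return new_supported & old_supported


OLD_KERNEL = frozenset({"feat_a", "feat_b"})
NEW_KERNEL = frozenset({"feat_a", "feat_b", "feat_c"})

# Guest launched on the new kernel but pinned to the old feature set.
pinned = rollback_safe_features(NEW_KERNEL, OLD_KERNEL)
print(can_migrate(pinned, OLD_KERNEL))      # True: rollback stays possible
print(can_migrate(NEW_KERNEL, OLD_KERNEL))  # False: feat_c is unknown to the old kernel
```

The second call shows why the patch asks for a selection mechanism: without a way to withhold `feat_c`, a guest started on the newer kernel could not be rolled back.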
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations 2026-05-11 8:57 [PATCH] Documentation: KVM: Document guest-visible compatibility expectations David Woodhouse @ 2026-05-11 15:14 ` Paolo Bonzini 2026-05-11 16:38 ` David Woodhouse 0 siblings, 1 reply; 29+ messages in thread From: Paolo Bonzini @ 2026-05-11 15:14 UTC (permalink / raw) To: David Woodhouse, Jonathan Corbet, Shuah Khan, kvm, linux-doc, linux-kernel, Sean Christopherson, Jim Mattson Cc: Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest On 5/11/26 10:57, David Woodhouse wrote: > From: David Woodhouse <dwmw@amazon.co.uk> > > Document the expectation that KVM maintains guest-visible compatibility > across host kernel upgrades and rollbacks. Specifically: > > - State saved/restored via KVM ioctls must be sufficient for live > migration (and live update) between kernel versions. > > - Where a new kernel introduces a guest-visible change, it provides a > mechanism for userspace to select the previous behaviour. > > - This allows both forward migration (upgrade) and backward migration > (rollback) of guests. > > These expectations have been implicitly required on x86 but were not > explicitly documented. Harmonise the expectations across all of KVM. One big part of achieving this on x86 is the handling of CPUID. Despite all the mess that KVM_SET_CPUID2 is (and sometimes the underlying architecture too, as Jim Mattson would certainly agree), KVM is generally able to provide a consistent view of its configuration to the guest. This doesn't quite extend to compatibility across vendors, but it does work across processor generations from either Intel or AMD. 
I understand that Arm traditionally had much more trouble than x86 with vendor-specified behavior that goes beyond the set of architectural features, so we may need to tune the expectations. However, I agree with David that this is needed at least as long as the host CPU does not change. Thanks, Paolo > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> > --- > Documentation/virt/kvm/api.rst | 14 ++++++++++++++ > Documentation/virt/kvm/review-checklist.rst | 20 ++++++++++++++------ > 2 files changed, 28 insertions(+), 6 deletions(-) > > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst > index 269970221797..864f3daa7acb 100644 > --- a/Documentation/virt/kvm/api.rst > +++ b/Documentation/virt/kvm/api.rst > @@ -97,6 +97,20 @@ Instead, kvm defines extension identifiers and a facility to query > whether a particular extension identifier is available. If it is, a > set of ioctls is available for application use. > > +KVM will ensure that the state that can be saved and restored via the > +KVM ioctls is sufficient to allow migration of a running guest between > +host kernels while maintaining full compatibility of the guest-visible > +device model. This includes migration to newer kernels (upgrade) and > +to older kernels (rollback), provided that the older kernel supports > +the set of features exposed to the guest. Where a new kernel version > +introduces a guest-visible change, it will provide a mechanism (such > +as a capability or a device attribute) that allows userspace to select > +the previous behaviour. This serves two purposes: guests migrated > +from an older kernel can continue to run with their original > +observable environment, and new guests launched on the newer kernel > +can be configured to match the feature set of the older kernel, so > +that they remain migratable to the older kernel in case of rollback. > + > > 4. 
API description > ================== > diff --git a/Documentation/virt/kvm/review-checklist.rst b/Documentation/virt/kvm/review-checklist.rst > index 053f00c50d66..f0fbe1577a90 100644 > --- a/Documentation/virt/kvm/review-checklist.rst > +++ b/Documentation/virt/kvm/review-checklist.rst > @@ -18,22 +18,30 @@ Review checklist for kvm patches > 5. New features must default to off (userspace should explicitly request them). > Performance improvements can and should default to on. > > -6. New cpu features should be exposed via KVM_GET_SUPPORTED_CPUID2, > +6. Guest-visible changes must not break migration compatibility. A guest > + migrated from an older kernel must be able to run with its original > + observable environment, and a guest launched on a newer kernel must be > + configurable to match the older kernel's feature set for rollback. > + Where a change alters guest-visible behaviour, provide a mechanism > + (capability, device attribute, etc.) for userspace to select the > + previous behaviour. > + > +7. New cpu features should be exposed via KVM_GET_SUPPORTED_CPUID2, > or its equivalent for non-x86 architectures > > -7. The feature should be testable (see below). > +8. The feature should be testable (see below). > > -8. Changes should be vendor neutral when possible. Changes to common code > +9. Changes should be vendor neutral when possible. Changes to common code > are better than duplicating changes to vendor code. > > -9. Similarly, prefer changes to arch independent code than to arch dependent > +10. Similarly, prefer changes to arch independent code than to arch dependent > code. > > -10. User/kernel interfaces and guest/host interfaces must be 64-bit clean > +11. User/kernel interfaces and guest/host interfaces must be 64-bit clean > (all variables and sizes naturally aligned on 64-bit; use specific types > only - u64 rather than ulong). > > -11. New guest visible features must either be documented in a hardware manual > +12. 
New guest visible features must either be documented in a hardware manual > or be accompanied by documentation. > > Testing of KVM code ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations 2026-05-11 15:14 ` Paolo Bonzini @ 2026-05-11 16:38 ` David Woodhouse 2026-05-11 16:56 ` Paolo Bonzini 0 siblings, 1 reply; 29+ messages in thread From: David Woodhouse @ 2026-05-11 16:38 UTC (permalink / raw) To: Paolo Bonzini, Jonathan Corbet, Shuah Khan, kvm, linux-doc, linux-kernel, Sean Christopherson, Jim Mattson Cc: Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest [-- Attachment #1: Type: text/plain, Size: 2432 bytes --] On Mon, 2026-05-11 at 17:14 +0200, Paolo Bonzini wrote: > On 5/11/26 10:57, David Woodhouse wrote: > > From: David Woodhouse <dwmw@amazon.co.uk> > > > > Document the expectation that KVM maintains guest-visible compatibility > > across host kernel upgrades and rollbacks. Specifically: > > > > - State saved/restored via KVM ioctls must be sufficient for live > > migration (and live update) between kernel versions. > > > > - Where a new kernel introduces a guest-visible change, it provides a > > mechanism for userspace to select the previous behaviour. > > > > - This allows both forward migration (upgrade) and backward migration > > (rollback) of guests. > > > > These expectations have been implicitly required on x86 but were not > > explicitly documented. Harmonise the expectations across all of KVM. > > One big part of achieving this on x86 is the handling of CPUID. Despite > all the mess that KVM_SET_CPUID2 is (and sometimes the underlying > architecture too, as Jim Mattson would certainly agree), KVM is > generally able to provide a consistent view of its configuration to the > guest. This doesn't quite extend to compatibility across vendors, but > it does work across processor generations from either Intel or AMD. Right. For x86 this is largely covered by CPUID. 
If you launch a guest on a new kernel, using the same CPUID bits as an older kernel, then your guest will mostly not see anything new. And will be migratable to that older kernel without issue. Not *everything* is in CPUID; one recent exception that comes to mind is the SUPPRESS_EOI_BROADCAST quirk. But on x86 we preserve the existing behaviour of older kernels — even when that behaviour doesn't make much sense, as with SUPPRESS_EOI_BROADCAST where older KVM would *advertise* the feature, but not actually *implement* it. Nevertheless, that remains the default behaviour of future kernels unless userspace explicitly opts in to fully enable (or disable) the feature. But this documentation update isn't even asking for that compatible-by- default behaviour, even though that is the right thing to do. It's only asking that it be *possible* to reinstate the old behaviour, for userspace that *knows* about the change and explicitly wants to go back to the old way to remain compatible. And sadly, KVM/arm64 doesn't even meet *that* low bar. [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5069 bytes --] ^ permalink raw reply [flat|nested] 29+ messages in thread
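The SUPPRESS_EOI_BROADCAST handling described in the message above can be sketched as a three-way setting whose default reproduces what older kernels did. The enum, the function, and the dictionary keys below are hypothetical names chosen for illustration; they are not the KVM API, and the reading of "disable" as "neither advertised nor implemented" is an assumption.

```python
from enum import Enum


class EoiBroadcastMode(Enum):
    """Hypothetical userspace-visible setting; not a real KVM enum."""
    LEGACY = "legacy"      # default: what older kernels did
    ENABLED = "enabled"    # explicit opt-in: advertise and implement
    DISABLED = "disabled"  # explicit opt-in: assumed to mean neither


def guest_view(mode: EoiBroadcastMode = EoiBroadcastMode.LEGACY) -> dict:
    """Return what the guest observes for each setting."""
    if mode is EoiBroadcastMode.ENABLED:
        return {"advertised": True, "implemented": True}
    if mode is EoiBroadcastMode.DISABLED:
        return {"advertised": False, "implemented": False}
    # LEGACY: older KVM advertised the feature without implementing it,
    # and that stays the default so existing guests see no change.
    return {"advertised": True, "implemented": False}
```

Calling `guest_view()` with no argument models a userspace that knows nothing about the change: the guest keeps the old, inconsistent behaviour. Only a userspace that passes `ENABLED` or `DISABLED` alters what the guest sees.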
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations 2026-05-11 16:38 ` David Woodhouse @ 2026-05-11 16:56 ` Paolo Bonzini 2026-05-11 17:53 ` David Woodhouse 2026-05-13 8:42 ` Marc Zyngier 0 siblings, 2 replies; 29+ messages in thread From: Paolo Bonzini @ 2026-05-11 16:56 UTC (permalink / raw) To: David Woodhouse, Jonathan Corbet, Shuah Khan, kvm, linux-doc, linux-kernel, Sean Christopherson, Jim Mattson, Marc Zyngier Cc: Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest On 5/11/26 18:38, David Woodhouse wrote: > Not *everything* is in CPUID; one recent exception that comes to mind > is the SUPPRESS_EOI_BROADCAST quirk. But on x86 we preserve the > existing behaviour of older kernels — even when that behaviour doesn't > make much sense, as with SUPPRESS_EOI_BROADCAST where older KVM would > *advertise* the feature, but not actually *implement* it. Nevertheless, > that remains the default behaviour of future kernels unless userspace > explicitly opts in to fully enable (or disable) the feature. > > But this documentation update isn't even asking for that compatible-by- > default behaviour, even though that is the right thing to do. It's only > asking that it be *possible* to reinstate the old behaviour, for > userspace that *knows* about the change and explicitly wants to go back > to the old way to remain compatible. Yep, these are the "quirks"---if it's too early for Arm to commit to that, I guess it's fine. However, independent of this patch which I (obviously) believe is a good idea, I'd like to understand how far it is, assuming 1) no quirks 2) same CPU host. By the way, you didn't Cc Marc... Paolo ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations 2026-05-11 16:56 ` Paolo Bonzini @ 2026-05-11 17:53 ` David Woodhouse 2026-05-13 8:42 ` Marc Zyngier 1 sibling, 0 replies; 29+ messages in thread From: David Woodhouse @ 2026-05-11 17:53 UTC (permalink / raw) To: Paolo Bonzini, Jonathan Corbet, Shuah Khan, kvm, linux-doc, linux-kernel, Sean Christopherson, Jim Mattson, Marc Zyngier Cc: Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest [-- Attachment #1: Type: text/plain, Size: 2352 bytes --] On Mon, 2026-05-11 at 18:56 +0200, Paolo Bonzini wrote: > On 5/11/26 18:38, David Woodhouse wrote: > > Not *everything* is in CPUID; one recent exception that comes to mind > > is the SUPPRESS_EOI_BROADCAST quirk. But on x86 we preserve the > > existing behaviour of older kernels — even when that behaviour doesn't > > make much sense, as with SUPPRESS_EOI_BROADCAST where older KVM would > > *advertise* the feature, but not actually *implement* it. Nevertheless, > > that remains the default behaviour of future kernels unless userspace > > explicitly opts in to fully enable (or disable) the feature. > > > > But this documentation update isn't even asking for that compatible-by- > > default behaviour, even though that is the right thing to do. It's only > > asking that it be *possible* to reinstate the old behaviour, for > > userspace that *knows* about the change and explicitly wants to go back > > to the old way to remain compatible. > > Yep, these are the "quirks"---if it's too early for Arm to commit to > that, I guess it's fine. > > However, independent of this patch which I (obviously) believe is a good > idea, I'd like to understand how far it is, assuming 1) no quirks 2) > same CPU host. 
It generally works out on arm64, although it's obviously a lot more work than x86, which makes an effort to get this stuff right. When we upgrade the kernel we do a lot of in-guest testing to find the stuff that "broke", like cache reporting: https://lore.kernel.org/all/254ca48a67779ccf9b9f60e2bb5796a305c03f95.camel@infradead.org/ ... and the GICD_IIDR thing which I reposted today: https://lore.kernel.org/all/20260511113558.3325004-2-dwmw2@infradead.org/ Those are the ones I came up against recently because someone had just *reverted* the offending commits locally in a previous kernel upgrade, and I'm trying to fix it *properly* this time around and not carry the reverts forward for ever. And fix the expectations too, of course. Being told that we shouldn't *expect* to be able to upgrade and roll back the kernel while remaining compatible is... not OK. > By the way, you didn't Cc Marc... Ah crap, I meant to. Thanks for spotting that! I must have screwed up when I combined and deduplicated the get_maintainer.pl output with the recipients of the IIDR patch series. [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5069 bytes --] ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations 2026-05-11 16:56 ` Paolo Bonzini 2026-05-11 17:53 ` David Woodhouse @ 2026-05-13 8:42 ` Marc Zyngier 2026-05-13 9:24 ` David Woodhouse 1 sibling, 1 reply; 29+ messages in thread From: Marc Zyngier @ 2026-05-13 8:42 UTC (permalink / raw) To: Paolo Bonzini Cc: David Woodhouse, Jonathan Corbet, Shuah Khan, kvm, linux-doc, linux-kernel, Sean Christopherson, Jim Mattson, Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest On Mon, 11 May 2026 17:56:15 +0100, Paolo Bonzini <pbonzini@redhat.com> wrote: > > On 5/11/26 18:38, David Woodhouse wrote: > > Not *everything* is in CPUID; one recent exception that comes to mind > > is the SUPPRESS_EOI_BROADCAST quirk. But on x86 we preserve the > > existing behaviour of older kernels — even when that behaviour doesn't > > make much sense, as with SUPPRESS_EOI_BROADCAST where older KVM would > > *advertise* the feature, but not actually *implement* it. Nevertheless, > > that remains the default behaviour of future kernels unless userspace > > explicitly opts in to fully enable (or disable) the feature. > > > > But this documentation update isn't even asking for that compatible-by- > > default behaviour, even though that is the right thing to do. It's only > > asking that it be *possible* to reinstate the old behaviour, for > > userspace that *knows* about the change and explicitly wants to go back > > to the old way to remain compatible. > > Yep, these are the "quirks"---if it's too early for Arm to commit to > that, I guess it's fine. Compatible by default means nothing, because userspace needs to discover the combined capabilities of the host and KVM. This is not a "CPU model" architecture. 
If userspace is not a total joke, it will read all the ID registers, and configure what it wants to see, assuming it is a feature that can be configured (not everything can, because the architecture itself is not fully backward compatible). Yes, this is buggy at times, because the combinatorial explosion of CPU capabilities and supported features makes it pretty hard to test (and really nobody actually does). But overall, it works, and QEMU is growing an infrastructure to manage it in a "user friendly" way. But really, this isn't what David is asking. He's demanding "bug for bug" compatibility. For that, we have two possible cases: - this is a behaviour that, while undesirable, is allowed by the architecture: fine, we preserve the behaviour and add another way to expose the one we really want. it is ugly, but we manage. - this is a behaviour that is not allowed by the architecture: we fix it for good. We do that on every release. Some minor, some much more visible. And there is no way we will add this sort of "bring the bugs back" type of behaviours. Specially when it is really obvious that no SW can make any reasonable use of the defect. We allow userspace to keep behaving as before, but the guest will not see a non-compliant behaviour. That being said, there is a way out of that: convince people in charge of the architecture that the non-compliant KVM behaviour is actually valuable, and deserves to be tolerated. This has happened before (VHE only and NV2 only, just to name two recent changes). Other terrible hacks (such as GICv3's GICD_TYPER.num_LPIs which KVM doesn't support) were added at the request of cloud vendors that David might be familiar with, so it isn't like it is a brand new process. And once it is in the architecture, it becomes a behaviour that is allowed to be exposed to a guest, for better or worse. These are the rules we have followed since we started KVM/arm, and I intend to stick to them. Thanks, M. 
-- Without deviation from the norm, progress is not possible. ^ permalink raw reply [flat|nested] 29+ messages in thread
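The userspace flow described in the message above (read the ID registers, then configure what the guest should see, where the field is configurable at all) can be sketched as a planning step. This is a toy model under stated assumptions: the field names and values are made up, the `writable` set stands in for whatever KVM reports as configurable, and no KVM ioctl is called.

```python
# Toy model: field names and values are made up; real ID registers are
# accessed through KVM's register ioctls, which this sketch does not call.

def plan_id_fields(host: dict, wanted: dict, writable: set):
    """Return (fields to write, fields that cannot be pinned)."""
    to_write, unpinnable = {}, []
    for name, value in wanted.items():
        if host.get(name) == value:
            continue                 # already matches, nothing to do
        if name in writable:
            to_write[name] = value   # userspace can configure this field
        else:
            unpinnable.append(name)  # differs, but is not configurable
    return to_write, unpinnable


host = {"field_x": 2, "field_y": 1, "field_z": 3}
wanted = {"field_x": 1, "field_y": 1, "field_z": 2}
to_write, unpinnable = plan_id_fields(host, wanted, writable={"field_x"})
print(to_write)    # {'field_x': 1}
print(unpinnable)  # ['field_z']
```

A non-empty `unpinnable` list is the case the thread argues about: a guest-visible difference that userspace has noticed but has no mechanism to pin to the earlier value.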
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations 2026-05-13 8:42 ` Marc Zyngier @ 2026-05-13 9:24 ` David Woodhouse 2026-05-13 12:43 ` Paolo Bonzini 0 siblings, 1 reply; 29+ messages in thread From: David Woodhouse @ 2026-05-13 9:24 UTC (permalink / raw) To: Marc Zyngier, Paolo Bonzini Cc: Jonathan Corbet, Shuah Khan, kvm, linux-doc, linux-kernel, Sean Christopherson, Jim Mattson, Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest [-- Attachment #1: Type: text/plain, Size: 4564 bytes --] On Wed, 2026-05-13 at 09:42 +0100, Marc Zyngier wrote: > On Mon, 11 May 2026 17:56:15 +0100, > Paolo Bonzini <pbonzini@redhat.com> wrote: > > > > On 5/11/26 18:38, David Woodhouse wrote: > > > Not *everything* is in CPUID; one recent exception that comes to mind > > > is the SUPPRESS_EOI_BROADCAST quirk. But on x86 we preserve the > > > existing behaviour of older kernels — even when that behaviour doesn't > > > make much sense, as with SUPPRESS_EOI_BROADCAST where older KVM would > > > *advertise* the feature, but not actually *implement* it. Nevertheless, > > > that remains the default behaviour of future kernels unless userspace > > > explicitly opts in to fully enable (or disable) the feature. > > > > > > But this documentation update isn't even asking for that compatible-by- > > > default behaviour, even though that is the right thing to do. It's only > > > asking that it be *possible* to reinstate the old behaviour, for > > > userspace that *knows* about the change and explicitly wants to go back > > > to the old way to remain compatible. > > > > Yep, these are the "quirks"---if it's too early for Arm to commit to > > that, I guess it's fine. > > Compatible by default means nothing, because userspace needs to > discover the combined capabilities of the host and KVM. 
This is not a > "CPU model" architecture. > > If userspace is not a total joke, it will read all the ID registers, > and configure what it wants to see, assuming it is a feature that can > be configured (not everything can, because the architecture itself is > not fully backward compatible). > > Yes, this is buggy at times, because the combinatorial explosion of > CPU capabilities and supported features makes it pretty hard to test > (and really nobody actually does). But overall, it works, and QEMU is > growing an infrastructure to manage it in a "user friendly" way. Yes, that is precisely what I'm asking for. I'm prepared to deal with the fact that KVM/Arm64 is not a stable and mature platform like x86 is, and that userspace has to find all the random changes from one version to the next, and explicitly pin things down to be compatible. All I'm asking for is that KVM makes it *possible* to pin things down to the behaviour of previously released Linux/KVM kernels. > But really, this isn't what David is asking. He's demanding "bug for > bug" compatibility. For that, we have two possible cases: No, I am not asking you to meet that bar. I merely observed that x86 does and that it would be nice. But we are a *long* way from that. > - this is a behaviour that, while undesirable, is allowed by the > architecture: fine, we preserve the behaviour and add another way to > expose the one we really want. it is ugly, but we manage. > > - this is a behaviour that is not allowed by the architecture: we fix > it for good. We do that on every release. Some minor, some much more > visible. And there is no way we will add this sort of "bring the > bugs back" type of behaviours. Specially when it is really obvious > that no SW can make any reasonable use of the defect. We allow > userspace to keep behaving as before, but the guest will not see a > non-compliant behaviour. 
> > That being said, there is a way out of that: convince people in charge > of the architecture that the non-compliant KVM behaviour is actually > valuable, and deserves to be tolerated. This has happened before (VHE > only and NV2 only, just to name two recent changes). > > Other terrible hacks (such as GICv3's GICD_TYPER.num_LPIs which KVM > doesn't support) were added at the request of cloud vendors that David > might be familiar with, so it isn't like it is a brand new process. > > And once it is in the architecture, it becomes a behaviour that is > allowed to be exposed to a guest, for better or worse. Marc, this is complete nonsense and you should know better. Once a behaviour is present in a released version of Linux/KVM, we can't just declare it "wrong" and unilaterally impose a change in guest-visible behaviour on *running* guests as a side-effect of a kernel upgrade. The criterion for *KVM* to remain compatible is "once it has been in a released version of the kernel". Not "once it is in the architecture". > These are the rules we have followed since we started KVM/arm, and I > intend to stick to them. Then KVM/arm is falling far short of the standards we expect of KVM and of Linux in general. Please do better. [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5069 bytes --] ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations 2026-05-13 9:24 ` David Woodhouse @ 2026-05-13 12:43 ` Paolo Bonzini 2026-05-13 13:03 ` Eric Auger 2026-05-13 13:57 ` David Woodhouse 0 siblings, 2 replies; 29+ messages in thread From: Paolo Bonzini @ 2026-05-13 12:43 UTC (permalink / raw) To: David Woodhouse, Marc Zyngier Cc: Jonathan Corbet, Shuah Khan, kvm, linux-doc, linux-kernel, Sean Christopherson, Jim Mattson, Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest On 5/13/26 11:24, David Woodhouse wrote: > On Wed, 2026-05-13 at 09:42 +0100, Marc Zyngier wrote: >> If userspace is not a total joke, it will read all the ID registers, >> and configure what it wants to see, assuming it is a feature that can >> be configured (not everything can, because the architecture itself is >> not fully backward compatible). >> >> Yes, this is buggy at times, because the combinatorial explosion of >> CPU capabilities and supported features makes it pretty hard to test >> (and really nobody actually does). But overall, it works, and QEMU is >> growing an infrastructure to manage it in a "user friendly" way. > > Yes, that is precisely what I'm asking for. I'm prepared to deal with > the fact that KVM/Arm64 is not a stable and mature platform like x86 > is, and that userspace has to find all the random changes from one > version to the next, and explicitly pin things down to be compatible. > > All I'm asking for is that KVM makes it *possible* to pin things down > to the behaviour of previously released Linux/KVM kernels. > >> But really, this isn't what David is asking. He's demanding "bug for >> bug" compatibility. For that, we have two possible cases: > > No, I am not asking you to meet that bar. I merely observed that x86 > does and that it would be nice. 
But we are a *long* way from that. x86 doesn't do bug-for-bug compatibility, thankfully - we have quirks but only 11 of them, or about one per year since we started adding them. We only add quirks, generally speaking, when 1) we change the way file descriptors are initialized, 2) guests in the wild were relying on it, or 3) it prevents restoring state saved from an old kernel. Is there anything else? So you're asking something not really far from this: >> - this is a behaviour that is not allowed by the architecture: we fix >> it for good. We do that on every release. Some minor, some much more >> visible. And there is no way we will add this sort of "bring the >> bugs back" type of behaviours. Specially when it is really obvious >> that no SW can make any reasonable use of the defect. We allow >> userspace to keep behaving as before, but the guest will not see a >> non-compliant behaviour. ... where for example https://lore.kernel.org/kvm/e03f092dfbb7d391a6bf2797ba01e122ba080bcd.camel@infradead.org/ is an example of a bug that "no SW can make any reasonable use of". > Marc, this is complete nonsense and you should know better. > Once a behaviour is present in a released version of Linux/KVM, we > can't just declare it "wrong" and unilaterally impose a change in > guest-visible behaviour on *running* guests as a side-effect of a > kernel upgrade. > > The criterion for *KVM* to remain compatible is "once it has been in a > released version of the kernel". Not "once it is in the architecture". That is *also* obviously nonsense though, isn't it (see example above)? The truth is in the middle: "once it is in the architecture" is likely too narrow but "once it is in a Linux release" is way too broad. And besides, both miss the point of *configurability* which is the basis of it all.

The main difference between x86 and Arm is the default state at
creation; x86 defaults to a blank slate, mostly; and when we didn't do
that, we regretted it later (cue the STUFF_FEATURE_MSRS quirk). It's
too late to change the behavior for Arm, but I think we can agree that
patches such as
https://lore.kernel.org/kvm/20260511113558.3325004-2-dwmw2@infradead.org/
("KVM: arm64: vgic: Allow userspace to set IIDR revision 1") are what
the letter and spirit of this proposal is about.

Marc did not mention having to deal with guests in the wild. Let's
ignore it for now because even defining "guests in the wild" is hard;
and anyway it's not related to the patch that triggered the discussion.

So we have the third case, "restoring state saved from an old kernel".
If this case arises, I do believe that Arm will have to deal with it and
introduce quirks or KVM_GET/SET_REG hacks. Maybe it hasn't happened
yet, lucky you.

Overall, even if we may disagree about the details, are we really on
terribly distant grounds, or are we not?

Paolo

* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations
From: Eric Auger @ 2026-05-13 13:03 UTC
To: Paolo Bonzini, David Woodhouse, Marc Zyngier
Cc: Jonathan Corbet, Shuah Khan, kvm, linux-doc, linux-kernel, Sean Christopherson, Jim Mattson, Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon, Raghavendra Rao Ananta, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest

Hi,

On 5/13/26 2:43 PM, Paolo Bonzini wrote:
> On 5/13/26 11:24, David Woodhouse wrote:
>> On Wed, 2026-05-13 at 09:42 +0100, Marc Zyngier wrote:
>>> If userspace is not a total joke, it will read all the ID registers,
>>> and configure what it wants to see, assuming it is a feature that can
>>> be configured (not everything can, because the architecture itself is
>>> not fully backward compatible).
>>>
>>> Yes, this is buggy at times, because the combinatorial explosion of
>>> CPU capabilities and supported features makes it pretty hard to test
>>> (and really nobody actually does). But overall, it works, and QEMU is
>>> growing an infrastructure to manage it in a "user friendly" way.
>>
>> Yes, that is precisely what I'm asking for. I'm prepared to deal with
>> the fact that KVM/Arm64 is not a stable and mature platform like x86
>> is, and that userspace has to find all the random changes from one
>> version to the next, and explicitly pin things down to be compatible.
>>
>> All I'm asking for is that KVM makes it *possible* to pin things down
>> to the behaviour of previously released Linux/KVM kernels.
>>
>>> But really, this isn't what David is asking. He's demanding "bug for
>>> bug" compatibility. For that, we have two possible cases:
>>
>> No, I am not asking you to meet that bar.
>> I merely observed that x86
>> does and that it would be nice. But we are a *long* way from that.
>
> x86 doesn't do bug-for-bug compatibility, thankfully - we have quirks
> but only 11 of them, or about one per year since we started adding
> them. We only add quirks, generally speaking, when 1) we change the
> way file descriptors are initialized, 2) guests in the wild were
> relying on it, or 3) it prevents restoring state saved from an old
> kernel. Is there anything else?
>
> So you're asking something not really far from this:
>
>>> - this is a behaviour that is not allowed by the architecture: we fix
>>> it for good. We do that on every release. Some minor, some much more
>>> visible. And there is no way we will add this sort of "bring the
>>> bugs back" type of behaviours. Especially when it is really obvious
>>> that no SW can make any reasonable use of the defect. We allow
>>> userspace to keep behaving as before, but the guest will not see a
>>> non-compliant behaviour.
>
> ... where for example
> https://lore.kernel.org/kvm/e03f092dfbb7d391a6bf2797ba01e122ba080bcd.camel@infradead.org/
> is an example of a bug that "no SW can make any reasonable use of".
>
>> Marc, this is complete nonsense and you should know better.
>> Once a behaviour is present in a released version of Linux/KVM, we
>> can't just declare it "wrong" and unilaterally impose a change in
>> guest-visible behaviour on *running* guests as a side-effect of a
>> kernel upgrade.
>>
>> The criterion for *KVM* to remain compatible is "once it has been in a
>> released version of the kernel". Not "once it is in the architecture".
>
> That is *also* obviously nonsense though, isn't it (see example
> above)? The truth is in the middle, "once it is in the architecture"
> is likely too narrow but "once it is in a Linux release" is way too
> broad. And besides, both miss the point of *configurability* which is
> the basis of it all.
>
> The main difference between x86 and Arm is the default state at
> creation; x86 defaults to a blank slate, mostly; and when we didn't do
> that, we regretted it later (cue the STUFF_FEATURE_MSRS quirk). It's
> too late to change the behavior for Arm, but I think we can agree that
> patches such as
> https://lore.kernel.org/kvm/20260511113558.3325004-2-dwmw2@infradead.org/
> ("KVM: arm64: vgic: Allow userspace to set IIDR revision 1") are what
> the letter and spirit of this proposal is about.
>
> Marc did not mention having to deal with guests in the wild. Let's
> ignore it for now because even defining "guests in the wild" is hard;
> and anyway it's not related to the patch that triggered the discussion.
>
> So we have the third case, "restoring state saved from an old kernel".
> If this case arises, I do believe that Arm will have to deal with it
> and introduce quirks or KVM_GET/SET_REG hacks. Maybe it hasn't
> happened yet, lucky you.

For info, this QEMU series was merged recently:

[PATCH v10 0/7] Mitigation of "failed to load
cpu:cpreg_vmstate_array_len" migration failures
https://lore.kernel.org/all/20260420140552.104369-1-eric.auger@redhat.com/#r

It brings an infrastructure to mitigate some migration failures across
different kernel versions.

There is also a series under review:

[PATCH v4 00/17] kvm/arm: Introduce a customizable aarch64 KVM host model
https://lore.kernel.org/all/20260503073541.790215-1-eric.auger@redhat.com/

This series aims to make it possible to set writable ID regs on the
host passthrough vcpu model.

Thanks

Eric

>
> Overall, even if we may disagree about the details, are we really on
> terribly distant grounds, or are we not?
>
> Paolo
>

* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations
From: David Woodhouse @ 2026-05-13 13:57 UTC
To: Paolo Bonzini, Marc Zyngier
Cc: Jonathan Corbet, Shuah Khan, kvm, linux-doc, linux-kernel, Sean Christopherson, Jim Mattson, Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest

On Wed, 2026-05-13 at 14:43 +0200, Paolo Bonzini wrote:
> On 5/13/26 11:24, David Woodhouse wrote:
> > On Wed, 2026-05-13 at 09:42 +0100, Marc Zyngier wrote:
> > > If userspace is not a total joke, it will read all the ID registers,
> > > and configure what it wants to see, assuming it is a feature that can
> > > be configured (not everything can, because the architecture itself is
> > > not fully backward compatible).
> > >
> > > Yes, this is buggy at times, because the combinatorial explosion of
> > > CPU capabilities and supported features makes it pretty hard to test
> > > (and really nobody actually does). But overall, it works, and QEMU is
> > > growing an infrastructure to manage it in a "user friendly" way.
> >
> > Yes, that is precisely what I'm asking for. I'm prepared to deal with
> > the fact that KVM/Arm64 is not a stable and mature platform like x86
> > is, and that userspace has to find all the random changes from one
> > version to the next, and explicitly pin things down to be compatible.
> >
> > All I'm asking for is that KVM makes it *possible* to pin things down
> > to the behaviour of previously released Linux/KVM kernels.
> >
> > > But really, this isn't what David is asking. He's demanding "bug for
> > > bug" compatibility.
> > > For that, we have two possible cases:
> >
> > No, I am not asking you to meet that bar. I merely observed that x86
> > does and that it would be nice. But we are a *long* way from that.
>
> x86 doesn't do bug-for-bug compatibility, thankfully - we have quirks
> but only 11 of them, or about one per year since we started adding them.
> We only add quirks, generally speaking, when 1) we change the way file
> descriptors are initialized, 2) guests in the wild were relying on it,
> or 3) it prevents restoring state saved from an old kernel. Is there
> anything else?
>
> So you're asking something not really far from this:
>
> > > - this is a behaviour that is not allowed by the architecture: we fix
> > > it for good. We do that on every release. Some minor, some much more
> > > visible. And there is no way we will add this sort of "bring the
> > > bugs back" type of behaviours. Especially when it is really obvious
> > > that no SW can make any reasonable use of the defect. We allow
> > > userspace to keep behaving as before, but the guest will not see a
> > > non-compliant behaviour.
>
> ... where for example
> https://lore.kernel.org/kvm/e03f092dfbb7d391a6bf2797ba01e122ba080bcd.camel@infradead.org/
> is an example of a bug that "no SW can make any reasonable use of".

I actually believe that the focus on ICEBP was triggered by some weird
gaming software's anti-DRM mechanism, and that it *did* affect actual
guests in the wild?

But yeah, *fixing* it should not have any adverse effects. That's the
key.

> > Marc, this is complete nonsense and you should know better.
> > Once a behaviour is present in a released version of Linux/KVM, we
> > can't just declare it "wrong" and unilaterally impose a change in
> > guest-visible behaviour on *running* guests as a side-effect of a
> > kernel upgrade.
> >
> > The criterion for *KVM* to remain compatible is "once it has been in a
> > released version of the kernel". Not "once it is in the architecture".
>
> That is *also* obviously nonsense though, isn't it (see example above)?
> The truth is in the middle, "once it is in the architecture" is likely
> too narrow but "once it is in a Linux release" is way too broad.

How about "once it is in a Linux release and guest visible, and unless
we *know* that changing it in either direction underneath running
guests cannot cause problems".

> And besides, both miss the point of *configurability* which is the basis of
> it all.

Hm, configurability *is* the point, I thought.

I'm not asking for the *default* to remain compatible. I only ask that
a VMM *can* ask KVM for guest-visible things to remain the same as
before.

> The main difference between x86 and Arm is the default state at
> creation; x86 defaults to a blank slate, mostly; and when we didn't do
> that, we regretted it later (cue the STUFF_FEATURE_MSRS quirk). It's
> too late to change the behavior for Arm, but I think we can agree that
> patches such as
> https://lore.kernel.org/kvm/20260511113558.3325004-2-dwmw2@infradead.org/
> ("KVM: arm64: vgic: Allow userspace to set IIDR revision 1") are what
> the letter and spirit of this proposal is about.

Yes. That *exact* patch.

> Marc did not mention having to deal with guests in the wild. Let's
> ignore it for now because even defining "guests in the wild" is hard;
> and anyway it's not related to the patch that triggered the discussion.
>
> So we have the third case, "restoring state saved from an old kernel".
> If this case arises, I do believe that Arm will have to deal with it and
> introduce quirks or KVM_GET/SET_REG hacks. Maybe it hasn't happened
> yet, lucky you.

We literally have those mechanisms already.
That's exactly what the revision field in the IIDR is used for:
https://developer.arm.com/documentation/111107/2026-03/External-Registers/GICD-IIDR--Distributor-Implementer-Identification-Register

See commit https://git.kernel.org/torvalds/c/49a1a2c70a7f which adds a
new guest-visible feature in revision 3, but allowed userspace to
restore the old behaviour by setting it to revision 2. (Or at least
intended to; there was a separate bug which stopped it working, which
I already fixed last week.) All my patch above does is make it
possible to set it to revision 1 as well. Because
https://git.kernel.org/torvalds/c/d53c2c29ae0d previously changed the
behaviour and bumped the default to 2 *without* allowing userspace to
restore the prior behaviour, and we've been carrying a *revert* of
that patch.

So the patch we're arguing about is just making that earlier
guest-visible change optional in precisely the way that is already
designed into KVM, and has been used for the subsequent change.

Why would we *not* accept such a patch?

It's not like I'm trying to upstream something like
https://david.woodhou.se/0001-Allow-writes-via-newly-readonly-PTE-for-buggy-Ubuntu.patch
... but yes, those *are* the lengths we have to go to sometimes to
ensure that when we upgrade the hosting environment, guests which have
worked for years don't suddenly break, however much they DESERVE to :)

> Overall, even if we may disagree about the details, are we really on
> terribly distant grounds, or are we not?

I genuinely have no idea. On one hand, no we are not terribly distant.
All the mechanisms to do this properly already *exist*, and the fix
I'm asking for is not much more than a one-liner to fix up the
previous oversight.

But on the other hand, Marc seems terribly insistent that we SHOULD NOT
restore the behaviour that older KVM offered to guests, and we MUST
change it unconditionally underneath running guests, making these
registers writable on upgrade...
and reverting them to read-only for running guests on a rollback.

And there we do have a very different viewpoint.
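
The revision-gated scheme described in the message above can be sketched as a toy model. This is a hypothetical Python illustration under stated assumptions, not kernel code: the class `ToyVgic`, its method names and the `min_settable_rev` parameter are invented for the example. The only facts taken from elsewhere are that GICD_IIDR carries its Revision field in bits [15:12] (per the GIC architecture specification) and, from the thread, that the current default revision is 3 and that userspace selects older guest-visible behaviour by writing a lower revision.

```python
"""Toy model of revision-gated guest-visible behaviour (not kernel code)."""

REV_SHIFT = 12              # GICD_IIDR.Revision occupies bits [15:12]
REV_MASK = 0xF << REV_SHIFT
LATEST_REV = 3              # default revision discussed in the thread


class ToyVgic:
    """Hypothetical distributor model: each guest-visible change is tied
    to the IIDR revision in which it first appeared."""

    def __init__(self, min_settable_rev):
        # Oldest revision this model "kernel" allows userspace to select.
        self.min_settable_rev = min_settable_rev
        # New VMs start at the latest revision unless userspace says otherwise.
        self.rev = LATEST_REV

    def iidr(self, base=0):
        """Return an IIDR value with the current revision in bits [15:12]."""
        return (base & ~REV_MASK) | (self.rev << REV_SHIFT)

    def set_iidr(self, value):
        """Userspace write: accept only revisions this model can emulate."""
        rev = (value & REV_MASK) >> REV_SHIFT
        if not (self.min_settable_rev <= rev <= LATEST_REV):
            raise ValueError(f"revision {rev} is not selectable")
        self.rev = rev

    def has_rev2_change(self):
        """Guest sees the behaviour that arrived with revision 2."""
        return self.rev >= 2

    def has_rev3_change(self):
        """Guest sees the behaviour that arrived with revision 3."""
        return self.rev >= 3
```

In this model, `ToyVgic(min_settable_rev=2)` stands for a kernel that can be pinned to revision 2 but not to revision 1, while `ToyVgic(min_settable_rev=1)` stands for one where a VMM restoring a guest from an older host can call `set_iidr(1 << REV_SHIFT)` and keep both later changes hidden from the guest.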

* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations
From: Paolo Bonzini @ 2026-05-13 16:24 UTC
To: David Woodhouse
Cc: Marc Zyngier, Jonathan Corbet, Shuah Khan, kvm, Linux Doc Mailing List, Kernel Mailing List, Linux, Sean Christopherson, Jim Mattson, Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest

On Wed, 13 May 2026 at 15:57, David Woodhouse <dwmw2@infradead.org> wrote:
> > x86 doesn't do bug-for-bug compatibility, thankfully - we have quirks
> > but only 11 of them, or about one per year since we started adding them.
> > We only add quirks, generally speaking, when 1) we change the way file
> > descriptors are initialized, 2) guests in the wild were relying on it,
> > or 3) it prevents restoring state saved from an old kernel. Is there
> > anything else?
> >
> > https://lore.kernel.org/kvm/e03f092dfbb7d391a6bf2797ba01e122ba080bcd.camel@infradead.org/
> > is an example of a bug that "no SW can make any reasonable use of".
>
> I actually believe that the focus on ICEBP was triggered by some weird
> gaming software's anti-DRM mechanism, and that it *did* affect actual
> guests in the wild?
>
> But yeah, *fixing* it should not have any adverse effects. That's the
> key.

Yep, so "bug for bug" is not it.

> > That is *also* obviously nonsense though, isn't it (see example above)?
> > The truth is in the middle, "once it is in the architecture" is likely
> > too narrow but "once it is in a Linux release" is way too broad.
>
> How about "once it is in a Linux release and guest visible, and unless
> we *know* that changing it in either direction underneath running
> guests cannot cause problems".
>
> > And besides, both miss the point of *configurability* which is the basis of
> > it all.
>
> Hm, configurability *is* the point, I thought.

Yes, and configurability goes way beyond bugs/quirks, which are to
some extent a red herring. Configurability for example says that "KVM:
arm64: vgic: Allow userspace to set IIDR revision 1" shouldn't be
controversial at all.

> > So we have the third case, "restoring state saved from an old kernel".
> > If this case arises, I do believe that Arm will have to deal with it and
> > introduce quirks or KVM_GET/SET_REG hacks. Maybe it hasn't happened
> > yet, lucky you.
>
> We literally have those mechanisms already.

I am not talking about guest-visible changes across save/restore here,
but rather about round-trips through userspace. For example, see the
effect of KVM_X2APIC_API_USE_32BIT_IDS on KVM_GET/SET_LAPIC: it
couldn't be made the default, because userspace expects to take old
data returned by KVM_GET_LAPIC and shove it into KVM_SET_LAPIC. Sucks
but can't be avoided.

> See commit https://git.kernel.org/torvalds/c/49a1a2c70a7f which adds a
> new guest-visible feature in revision 3, but allowed userspace to
> restore the old behaviour by setting it to revision 2. All my patch
> above does, is make it possible to set it to revision 1 as
> well. Because https://git.kernel.org/torvalds/c/d53c2c29ae0d previously
> changed the behaviour and bumped the default to 2 *without* allowing
> userspace to restore the prior behaviour, and we've been carrying a
> *revert* of that patch.
>
> Why would we *not* accept such a patch?

Agreed. Even ignoring your revert, there's no reason why any upgrade
past 49a1a2c70a7f has to be from after d53c2c29ae0d.

> Marc seems terribly insistent that we SHOULD NOT
> restore the behaviour that older KVM offered to guests, and we MUST
> change it unconditionally underneath running guests, making these
> registers writable on upgrade... and reverting them to read-only for
> running guests on a rollback.
>
> And there we do have a very different viewpoint.

That's the design decision I mentioned, of not starting the guest
configuration from a clean slate. I believe it complicates things
because you have to design from the beginning with the ability to
rollback to old versions and to potentially detect conflicts
introduced by the rollback. This is exactly why
KVM_X86_QUIRK_STUFF_FEATURE_MSRS was introduced: "KVM's initialization
of feature MSRs during vCPU creation results in a failed save/restore
of PERF_CAPABILITIES. If userspace configures the VM to _not_ have a
PMU, because KVM initializes the vCPU's PERF_CAPABILITIES, trying to
save/restore the non-zero value will be rejected by the destination."
(https://lkml.org/lkml/2024/8/2/1032)

For Arm, however, it may be too late to change it; if not, I'll
happily watch you argue with Marc about it. But even without that,
this doc patch (and the idea that "Where a new kernel introduces a
guest-visible change, it provides a mechanism for userspace to select
the previous behaviour") should be uncontroversial.

Paolo
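
The failure mode quoted above for KVM_X86_QUIRK_STUFF_FEATURE_MSRS can be illustrated with a small model. This is a deliberately simplified Python sketch, not KVM's actual logic: `ToyVcpu`, `migrate`, the `stuff_defaults` flag and the `HOST_PERF_CAPS` value are all hypothetical, and the validation rule (a non-zero PERF_CAPABILITIES write is refused when the vCPU has no PMU) is an assumption drawn only from the quoted description.

```python
"""Toy model of the 'stuffed default' save/restore failure (not KVM code)."""

HOST_PERF_CAPS = 0x1234     # arbitrary non-zero placeholder value


class ToyVcpu:
    def __init__(self, stuff_defaults):
        # Placeholder until userspace configures the vCPU model.
        self.has_pmu = True
        # With the quirk-like behaviour, creation pre-loads a non-zero value;
        # with a blank slate, the register starts at zero.
        self.perf_caps = HOST_PERF_CAPS if stuff_defaults else 0

    def configure(self, has_pmu):
        """Userspace picks the vCPU model; the pre-loaded value is untouched."""
        self.has_pmu = has_pmu

    def set_perf_caps(self, value):
        """Restore path: a non-zero value is refused when there is no PMU."""
        if value != 0 and not self.has_pmu:
            raise ValueError("PERF_CAPABILITIES rejected: vCPU has no PMU")
        self.perf_caps = value


def migrate(stuff_defaults):
    """Save from a PMU-less source vCPU and restore into a fresh destination."""
    src = ToyVcpu(stuff_defaults)
    src.configure(has_pmu=False)
    saved = src.perf_caps            # what a naive save would read back

    dst = ToyVcpu(stuff_defaults)
    dst.configure(has_pmu=False)
    dst.set_perf_caps(saved)         # raises if the saved value is non-zero
    return dst.perf_caps
```

In this model `migrate(stuff_defaults=False)` succeeds because the blank-slate vCPU never held a value that its own configuration forbids, whereas `migrate(stuff_defaults=True)` raises, which is the shape of the conflict that starting from a pre-populated state makes possible.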

* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations
From: David Woodhouse @ 2026-05-13 18:26 UTC
To: Paolo Bonzini
Cc: Marc Zyngier, Jonathan Corbet, Shuah Khan, kvm, Linux Doc Mailing List, Kernel Mailing List, Linux, Sean Christopherson, Jim Mattson, Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest

On Wed, 2026-05-13 at 18:24 +0200, Paolo Bonzini wrote:
> On Wed, 13 May 2026 at 15:57, David Woodhouse <dwmw2@infradead.org> wrote:
> > > x86 doesn't do bug-for-bug compatibility, thankfully - we have quirks
> > > but only 11 of them, or about one per year since we started adding them.
> > > We only add quirks, generally speaking, when 1) we change the way file
> > > descriptors are initialized, 2) guests in the wild were relying on it,
> > > or 3) it prevents restoring state saved from an old kernel. Is there
> > > anything else?
> > >
> > > https://lore.kernel.org/kvm/e03f092dfbb7d391a6bf2797ba01e122ba080bcd.camel@infradead.org/
> > > is an example of a bug that "no SW can make any reasonable use of".
> >
> > I actually believe that the focus on ICEBP was triggered by some weird
> > gaming software's anti-DRM mechanism, and that it *did* affect actual
> > guests in the wild?
> >
> > But yeah, *fixing* it should not have any adverse effects. That's the
> > key.
>
> Yep, so "bug for bug" is not it.

Of course. I'm not discriminating between 'bugs' and 'features'. In
this context I only care about guest-visible behaviour changes,
whatever the reason.

What I said was:

> > > > Once a behaviour is present in a released version of Linux/KVM, we
> > > > can't just declare it "wrong" and unilaterally impose a change in
> > > > guest-visible behaviour on *running* guests as a side-effect of a
> > > > kernel upgrade.

And yes, you're technically right to challenge that phrasing of it. It
does need the additional caveat of "...unless we are sure that changing
it in either direction underneath running guests cannot cause
problems", as discussed. That's the key for the ICEBP thing.

> > > And besides, both miss the point of *configurability* which is the basis of
> > > it all.
> >
> > Hm, configurability *is* the point, I thought.
>
> Yes, and configurability goes way beyond bugs/quirks, which are to
> some extent a red herring. Configurability for example says that "KVM:
> arm64: vgic: Allow userspace to set IIDR revision 1" shouldn't be
> controversial at all.

Indeed it shouldn't. And yet here we are.

> > > So we have the third case, "restoring state saved from an old kernel".
> > > If this case arises, I do believe that Arm will have to deal with it and
> > > introduce quirks or KVM_GET/SET_REG hacks. Maybe it hasn't happened
> > > yet, lucky you.
> >
> > We literally have those mechanisms already.
>
> I am not talking about guest-visible changes across save/restore here,
> but rather about round-trips through userspace. For example, see the
> effect of KVM_X2APIC_API_USE_32BIT_IDS on KVM_GET/SET_LAPIC: it
> couldn't be made the default, because userspace expects to take old
> data returned by KVM_GET_LAPIC and shove it into KVM_SET_LAPIC. Sucks
> but can't be avoided.

Yes, you're right. And I fully expect and trust x86 to get that right
and not break existing userspace in any way at all.

But honestly, the bar for Arm is so low right now that anything I
physically *can* work around in userspace, I'm prepared to tolerate.
If KVM/arm did the equivalent of just changing the KVM_[SG]ET_LAPIC
data without the KVM_X2APIC_API_USE_32BIT_IDS trick, I wouldn't even
bat an eyelid; I'd just accommodate it and move on.

> > See commit https://git.kernel.org/torvalds/c/49a1a2c70a7f which adds a
> > new guest-visible feature in revision 3, but allowed userspace to
> > restore the old behaviour by setting it to revision 2. All my patch
> > above does, is make it possible to set it to revision 1 as
> > well. Because https://git.kernel.org/torvalds/c/d53c2c29ae0d previously
> > changed the behaviour and bumped the default to 2 *without* allowing
> > userspace to restore the prior behaviour, and we've been carrying a
> > *revert* of that patch.
> >
> > Why would we *not* accept such a patch?
>
> Agreed. Even ignoring your revert, there's no reason why any upgrade
> past 49a1a2c70a7f has to be from after d53c2c29ae0d.
>
> > Marc seems terribly insistent that we SHOULD NOT
> > restore the behaviour that older KVM offered to guests, and we MUST
> > change it unconditionally underneath running guests, making these
> > registers writable on upgrade... and reverting them to read-only for
> > running guests on a rollback.
> >
> > And there we do have a very different viewpoint.
>
> That's the design decision I mentioned, of not starting the guest
> configuration from a clean slate. I believe it complicates things
> because you have to design from the beginning with the ability to
> rollback to old versions and to potentially detect conflicts
> introduced by the rollback. This is exactly why
> KVM_X86_QUIRK_STUFF_FEATURE_MSRS was introduced: "KVM's initialization
> of feature MSRs during vCPU creation results in a failed save/restore
> of PERF_CAPABILITIES. If userspace configures the VM to _not_ have a
> PMU, because KVM initializes the vCPU's PERF_CAPABILITIES, trying to
> save/restore the non-zero value will be rejected by the destination."
> (https://lkml.org/lkml/2024/8/2/1032)

No, I don't think this is like that. In that case, IIUC it was at least
*possible* for userspace to manually filter out capabilities and adjust
things. But it kind of sucked if we *made* userspace do that and broke
things for existing userspace, so of *course* x86 did better.

I'm not even *dreaming* about a world where KVM/arm meets that bar.

> For Arm, however, it may be too late to change it; if not, I'll
> happily watch you argue with Marc about it.

I'm not even going to try. You're right that it's the better option,
and it most certainly *isn't* too late for Arm to choose to be a stable
and mature platform providing continuity to userspace like x86 does.

But we are *so* far from that right now; we're fighting even to have
the *possibility* for userspace to remain compatible, even if userspace
*is* updated to know everything that the latest kernel changed
underneath it.

> But even without that,
> this doc patch (and the idea that "Where a new kernel introduces a
> guest-visible change, it provides a mechanism for userspace to select
> the previous behaviour") should be uncontroversial.

Indeed. And again, if you really want then you can add the caveat
discussed above, "unless you're really sure it won't make *any*
difference to the zoo of possible guests running Linux, Windows,
FreeBSD, or any number of random home-grown or network appliance
operating systems".

Although I didn't think it really needed spelling out in the doc, just
as I didn't think it needed spelling out earlier today (although you
called my sentence nonsense purely because it lacked that obvious
caveat, AFAICT).

* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations
From: David Woodhouse @ 2026-05-19 10:41 UTC
To: Paolo Bonzini
Cc: Marc Zyngier, Jonathan Corbet, Shuah Khan, kvm, Linux Doc Mailing List, Kernel Mailing List, Linux, Sean Christopherson, Jim Mattson, Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest

On Wed, 2026-05-13 at 18:24 +0200, Paolo Bonzini wrote:
>
> > See commit https://git.kernel.org/torvalds/c/49a1a2c70a7f which adds a
> > new guest-visible feature in revision 3, but allowed userspace to
> > restore the old behaviour by setting it to revision 2. All my patch
> > above does, is make it possible to set it to revision 1 as
> > well. Because https://git.kernel.org/torvalds/c/d53c2c29ae0d previously
> > changed the behaviour and bumped the default to 2 *without* allowing
> > userspace to restore the prior behaviour, and we've been carrying a
> > *revert* of that patch.
> >
> > Why would we *not* accept such a patch?
>
> Agreed. Even ignoring your revert, there's no reason why any upgrade
> past 49a1a2c70a7f has to be from after d53c2c29ae0d.

So where do we go from here?

I assume you'll be taking this Documentation patch via the KVM tree?

But what about the actual fix at
https://lore.kernel.org/all/20260511113558.3325004-2-dwmw2@infradead.org/

This is a simple and unintrusive bug fix to make KVM/arm64 follow the
"common sense" requirement that the doc patch codifies, apparently
being rejected with the rather bizarre claim that KVM has no *need* to
maintain guest-visible compatibility across host kernel changes.

So...
what next? Is one of the other KVM/arm64 maintainers going to
speak up? Paolo, would you consider taking the fixes through your tree
directly?

Does Arm not actually *care* whether AArch64 is considered a stable and
mature platform for KVM hosting?

We don't have CONFIG_EXPERIMENTAL any more, do we? Or perhaps we could
mark it as such. Is CONFIG_STAGING the right thing, for unstable things
which might violate the normal maturity expectations of the kernel?

* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations
From: Will Deacon @ 2026-05-19 11:11 UTC
To: David Woodhouse
Cc: Paolo Bonzini, Marc Zyngier, Jonathan Corbet, Shuah Khan, kvm, Linux Doc Mailing List, Kernel Mailing List, Linux, Sean Christopherson, Jim Mattson, Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest

On Tue, May 19, 2026 at 11:41:26AM +0100, David Woodhouse wrote:
> On Wed, 2026-05-13 at 18:24 +0200, Paolo Bonzini wrote:
> >
> > > See commit https://git.kernel.org/torvalds/c/49a1a2c70a7f which adds a
> > > new guest-visible feature in revision 3, but allowed userspace to
> > > restore the old behaviour by setting it to revision 2. All my patch
> > > above does, is make it possible to set it to revision 1 as
> > > well. Because https://git.kernel.org/torvalds/c/d53c2c29ae0d previously
> > > changed the behaviour and bumped the default to 2 *without* allowing
> > > userspace to restore the prior behaviour, and we've been carrying a
> > > *revert* of that patch.
> > >
> > > Why would we *not* accept such a patch?
> >
> > Agreed. Even ignoring your revert, there's no reason why any upgrade
> > past 49a1a2c70a7f has to be from after d53c2c29ae0d.
>
> So where do we go from here?
>
> I assume you'll be taking this Documentation patch via the KVM tree?
>
> But what about the actual fix at
> https://lore.kernel.org/all/20260511113558.3325004-2-dwmw2@infradead.org/
>
> This is a simple and unintrusive bug fix to make KVM/arm64 follow the
> "common sense" requirement that the doc patch codifies, apparently
> being rejected with the rather bizarre claim that KVM has no *need* to
> maintain guest-visible compatibility across host kernel changes.
>
> So... what next? Is one of the other KVM/arm64 maintainers going to
> speak up? Paolo would you consider taking the fixes through your tree
> directly?
>
> Does Arm not actually *care* whether AArch64 is considered a stable and
> mature platform for KVM hosting?

Hey, come on. Marc cares more about this stuff than anybody else on the
planet. He's been single-handedly maintaining the tree for the past
couple of releases while Oliver was out and he's on the end of a _lot_
of patches. I'm only cc'd on a fraction of the KVM/arm64 changes and
it's bedlam.

Will
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations 2026-05-19 11:11 ` Will Deacon @ 2026-05-19 11:44 ` David Woodhouse 2026-05-19 12:13 ` Paolo Bonzini 0 siblings, 1 reply; 29+ messages in thread From: David Woodhouse @ 2026-05-19 11:44 UTC (permalink / raw) To: Will Deacon Cc: Paolo Bonzini, Marc Zyngier, Jonathan Corbet, Shuah Khan, kvm, Linux Doc Mailing List, Kernel Mailing List, Linux, Sean Christopherson, Jim Mattson, Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest [-- Attachment #1: Type: text/plain, Size: 3074 bytes --] On Tue, 2026-05-19 at 12:11 +0100, Will Deacon wrote: > On Tue, May 19, 2026 at 11:41:26AM +0100, David Woodhouse wrote: > > On Wed, 2026-05-13 at 18:24 +0200, Paolo Bonzini wrote: > > > > > > > See commit https://git.kernel.org/torvalds/c/49a1a2c70a7f which adds a > > > > new guest-visible feature in revision 3, but allowed userspace to > > > > restore the old behaviour by setting it to revision 2. All my patch > > > > above does, is make it possible to set it to revision 1 as > > > > well. Because https://git.kernel.org/torvalds/c/d53c2c29ae0d previously > > > > changed the behaviour and bumped the default to 2 *without* allowing > > > > userspace to restore the prior behaviour, and we've been carrying a > > > > *revert* of that patch. > > > > > > > > Why would we *not* accept such a patch? > > > > > > Agreed. Even ignoring your revert, there's no reason why any upgrade > > > past 49a1a2c70a7f has to be from after d53c2c29ae0d. > > > > So where do we go from here? > > > > I assume you'll be taking this Documentation patch via the KVM tree? 
> > > > But what about the actual fix at > > https://lore.kernel.org/all/20260511113558.3325004-2-dwmw2@infradead.org/ > > > > This is a simple and unintrusive bug fix to make KVM/arm64 follow the > > "common sense" requirement that the doc patch codifies, apparently > > being rejected with the rather bizarre claim that KVM has no *need* to > > maintain guest-visible compatibility across host kernel changes. > > > > So... what next? Is one of the other KVM/arm64 maintainers going to > > speak up? Paolo would you consider taking the fixes through your tree > > directly? > > > > Does Arm not actually *care* whether AArch64 is considered a stable and > > mature platform for KVM hosting? > > Hey, come on. Marc cares more about this stuff than anybody else on the > planet. He's been single-handedly maintaining the tree for the past > couple of releases while Oliver was out and he's on the end of a _lot_ > of patches. I'm only cc'd on a fraction of the KVM/arm64 changes and > it's bedlam. I certainly wouldn't disagree with any of that. The depth of knowledge and the amount of energy that Marc displays through this work is impressive, and I'm sure we all have an enormous amount of respect for it, and for him. I know I do. Nevertheless, the specific technical decision to reject the simple bug fix linked above is dead wrong. Because the principle under which it was rejected — the idea that KVM has no responsibility to maintain compatibility of guest-visible behaviour from one kernel version to the next — is also dead wrong. If KVM on arm64 doesn't aspire to maintain guest compatibility across host kernel changes — regardless of whether the previous kernel's behaviour was "blessed" by the architecture specification or not — then it does not meet the expectation that we have of KVM implementations in the Linux kernel. Or indeed the standards that we've held for Linux kernel ABIs for the last 35 years. 
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations 2026-05-19 11:44 ` David Woodhouse @ 2026-05-19 12:13 ` Paolo Bonzini 2026-05-19 12:38 ` Marc Zyngier 2026-05-19 12:42 ` David Woodhouse 0 siblings, 2 replies; 29+ messages in thread From: Paolo Bonzini @ 2026-05-19 12:13 UTC (permalink / raw) To: David Woodhouse Cc: Will Deacon, Marc Zyngier, Jonathan Corbet, Shuah Khan, kvm, Linux Doc Mailing List, Kernel Mailing List, Linux, Sean Christopherson, Jim Mattson, Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest On Tue, May 19, 2026 at 1:44 PM David Woodhouse <dwmw2@infradead.org> wrote: > > > So... what next? Is one of the other KVM/arm64 maintainers going to > > > speak up? Paolo would you consider taking the fixes through your tree > > > directly? I admit that my knowledge of Arm is really limited, and I do not understand which IIDR values have architecturally allowed behaviors and which (if any) were made up by KVM; but even if I cannot honestly remark on the code or even the approach, a compatibility knob is the right thing to have. That's a userspace API design matter, not an Arm or GIC matter. I hope that Marc provides a better explanation of why he believes https://lore.kernel.org/all/20260511113558.3325004-2-dwmw2@infradead.org/ shouldn't be accepted, because I am more than a bit puzzled about *why* that patch is being rejected or (in v3) so far ignored. Marc in this thread wrote: "If userspace is not a total joke, it will read all the ID registers, and configure what it wants to see, assuming it is a feature that can be configured (not everything can, because the architecture itself is not fully backward compatible)". 
But in this case there's an ID register that tells KVM if userspace wants the old or the new behavior, independent of whether that old behavior is architecturally valid or not. I will certainly take this patch, but I won't override Marc. However I'd like to better understand his point of view, because right now I just don't get it. > If KVM on arm64 doesn't aspire to maintain guest compatibility across > host kernel changes — regardless of whether the previous kernel's > behaviour was "blessed" by the architecture specification or not — then > it does not meet the expectation that we have of KVM implementations in > the Linux kernel. I agree with the "aspire" wording. Even if it's not going to be 100% achievable, KVM *needs* to aspire to maintain both guest compatibility and architecture precision. Sometimes it's impossible, sometimes there are constraints that require you to trade off one for another (e.g. via quirks, or by breaking behavior that no sane guest would have cared about). But in general as a maintainer you don't *get* to choose. Paolo > Or indeed the standards that we've held for Linux kernel ABIs for the > last 35 years. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations 2026-05-19 12:13 ` Paolo Bonzini @ 2026-05-19 12:38 ` Marc Zyngier 2026-05-19 12:56 ` Marc Zyngier 2026-05-19 12:59 ` David Woodhouse 2026-05-19 12:42 ` David Woodhouse 1 sibling, 2 replies; 29+ messages in thread From: Marc Zyngier @ 2026-05-19 12:38 UTC (permalink / raw) To: Paolo Bonzini Cc: David Woodhouse, Will Deacon, Jonathan Corbet, Shuah Khan, kvm, Linux Doc Mailing List, Kernel Mailing List, Linux, Sean Christopherson, Jim Mattson, Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest On Tue, 19 May 2026 13:13:41 +0100, Paolo Bonzini <pbonzini@redhat.com> wrote: > > On Tue, May 19, 2026 at 1:44 PM David Woodhouse <dwmw2@infradead.org> wrote: > > > > So... what next? Is one of the other KVM/arm64 maintainers going to > > > > speak up? Paolo would you consider taking the fixes through your tree > > > > directly? > > I admit that my knowledge of Arm is really limited, and I do not > understand which IIDR values have architecturally allowed behaviors > and which (if any) were made up by KVM; but even if I cannot honestly > remark on the code or even the approach, a compatibility knob is the > right thing to have. That's a userspace API design matter, not an Arm > or GIC matter. I agree that we can have the knob -- not having it is a userspace issue, and I have said that I was OK with preserving the userspace interface. > > I hope that Marc provides a better explanation of why he believes > https://lore.kernel.org/all/20260511113558.3325004-2-dwmw2@infradead.org/ > shouldn't be accepted, because I am more than a bit puzzled about > *why* that patch is being rejected or (in v3) so far ignored. 
Marc in > this thread wrote: "If userspace is not a total joke, it will read all > the ID registers, and configure what it wants to see, assuming it is a > feature that can be configured (not everything can, because the > architecture itself is not fully backward compatible)". This was a more general comment on the full mechanism that we use to save/restore the state and at the same time configure the feature set. Which is what the GICD_IIDR does to some extent for the GIC. > But in this case there's an ID register that tells KVM if userspace > wants the old or the new behavior, independent of whether that old > behavior is architecturally valid or not. But the "old behaviour" makes no sense, and cannot be used by a guest: - either the guest doesn't use the alternative interrupt groups, then it wasn't affected by the bug. That's 100% of the guests. - or the guest did try to use the alternative groups, and it *NEVER* worked, as it wouldn't get any interrupt at all. What is the point of preserving a "feature" that only results in a non-working guest? Given that, re-introducing a behaviour that cannot be used makes zero sense to me. > I will certainly take this patch, but I won't override Marc. However > I'd like to better understand his point of view, because right now I > just don't get it. I don't get it either, but for different reasons. > > > If KVM on arm64 doesn't aspire to maintain guest compatibility across > > host kernel changes — regardless of whether the previous kernel's > > behaviour was "blessed" by the architecture specification or not — then > > it does not meet the expectation that we have of KVM implementations in > > the Linux kernel. > > I agree with the "aspire" wording. Even if it's not going to be 100% > achievable, KVM *needs* to aspire to maintain both guest compatibility > and architecture precision. Sometimes it's impossible, sometimes there > are constraints that require you to trade off one for another (e.g. 
> via quirks, or by breaking behavior that no sane guest would have > cared about). But in general as a maintainer you don't *get* to > choose. > > Paolo > > > Or indeed the standards that we've held for Linux kernel ABIs for the > > last 35 years. As I said before, I'd be OK with something that would restore IIDR to REV1. But not something that actively breaks the GIC emulation by reintroducing a bug. That's, by construction, dead code that will only bitrot, because there is no SW that can make use of this nonsense. M. -- Without deviation from the norm, progress is not possible. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations 2026-05-19 12:38 ` Marc Zyngier @ 2026-05-19 12:56 ` Marc Zyngier 2026-05-19 13:24 ` David Woodhouse 2026-05-19 12:59 ` David Woodhouse 1 sibling, 1 reply; 29+ messages in thread From: Marc Zyngier @ 2026-05-19 12:56 UTC (permalink / raw) To: Paolo Bonzini Cc: David Woodhouse, Will Deacon, Jonathan Corbet, Shuah Khan, kvm, Linux Doc Mailing List, Kernel Mailing List, Linux, Sean Christopherson, Jim Mattson, Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest On Tue, 19 May 2026 13:38:57 +0100, Marc Zyngier <maz@kernel.org> wrote: > > As I said before, I'd be OK with something that would restore IIDR to > REV1. But not something that actively breaks the GIC emulation by > reintroducing a bug. That's, by construction, dead code that will only > bitrot, because there is no SW that can make use of this nonsense. I will also add that if we make it a policy to preserve buggy behaviours that the guest cannot be relying on, then I question whether we should be fixing anything at all. For example, 6.19 fixed a totally buggy behaviour where a guest couldn't have more than (on most HW) 4 interrupts in flight at any given time. This was obviously totally bogus, and this was fixed unconditionally, as legitimate guests could experience gold-plated lock-ups. Should we revert to the previous behaviour? In the affirmative, I will simply stop fixing things, and someone else can have fun retrofitting buggy crap. M. -- Without deviation from the norm, progress is not possible. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations 2026-05-19 12:56 ` Marc Zyngier @ 2026-05-19 13:24 ` David Woodhouse 0 siblings, 0 replies; 29+ messages in thread From: David Woodhouse @ 2026-05-19 13:24 UTC (permalink / raw) To: Marc Zyngier, Paolo Bonzini Cc: Will Deacon, Jonathan Corbet, Shuah Khan, kvm, Linux Doc Mailing List, Kernel Mailing List, Linux, Sean Christopherson, Jim Mattson, Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest [-- Attachment #1: Type: text/plain, Size: 2101 bytes --] On Tue, 2026-05-19 at 13:56 +0100, Marc Zyngier wrote: > On Tue, 19 May 2026 13:38:57 +0100, > Marc Zyngier <maz@kernel.org> wrote: > > > > As I said before, I'd be OK with something that would restore IIDR to > > REV1. But not something that actively breaks the GIC emulation by > > reintroducing a bug. That's, by construction, dead code that will only > > bitrot, because there is no SW that can make use of this nonsense. > > I will also add that if we make it a policy to preserve buggy > behaviours that the guest cannot be relying on, then I question > whether we should be fixing anything at all. I think we just have a different understanding of what it practically means to have behaviour "that the guest cannot be relying on", as in the examples I just described for the IIDR issue. > For example, 6.19 fixed a totally buggy behaviour where a guest > couldn't have more than (on most HW) 4 interrupts in flight at any > given time. This was obviously totally bogus, and this was fixed > unconditionally, as legitimate guests could experience gold-plated > lock-ups. And marked with a Fixes: tag and backported to stable, one presumes? I'm confused that you think this is relevant.
Can you contrive a situation where a guest actually relied on this bug and *survived*, like the situations I just explained for the IIDR issue? You can nit-pick my hypotheticals as unlikely — and they are. But if we always just YOLO it and change guest-visible behaviour on the basis that it's "unlikely" to break anyone, and there are many such changes in a given kernel deployment (e.g. from v6.6 to v6.18), then the cumulative probability of being bitten by one of those "unlikely" problems approaches 1. There's a reason we do a *huge* amount of testing of what the guest sees as we move from one kernel to the next, and back again, and endeavour to eliminate all those differences. And once the new kernel *is* deployed and won't be rolled back, of course, all new launches can get the newer behaviour (and the latest version of PSCI, etc...) [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5069 bytes --] ^ permalink raw reply [flat|nested] 29+ messages in thread
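The cumulative-risk argument in the message above can be illustrated with a short calculation. This is only a sketch: it assumes that each guest-visible change breaks a given guest independently and with the same small probability p, and the values of p and n below are invented for illustration, not measurements from any deployment. The function name is hypothetical.

```python
def cumulative_break_probability(p: float, n: int) -> float:
    """Probability that at least one of n independent guest-visible
    changes breaks a guest, if each one does so with probability p.

    Assumption (not from the thread): the changes are independent and
    share the same per-change probability.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if n < 0:
        raise ValueError("n must be non-negative")
    # P(no change breaks the guest) = (1 - p) ** n
    return 1.0 - (1.0 - p) ** n


# Illustrative values only: a 1% chance per change, for a growing
# number of changes accumulated across a multi-version kernel upgrade.
for n_changes in (1, 10, 100, 1000):
    risk = cumulative_break_probability(0.01, n_changes)
    print(f"{n_changes:5d} changes -> {risk:.4f}")
```

Under these assumptions the result grows monotonically with n and tends to 1, which is the shape of the argument being made; the real per-change probabilities are unknown and certainly not uniform.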
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations 2026-05-19 12:38 ` Marc Zyngier 2026-05-19 12:56 ` Marc Zyngier @ 2026-05-19 12:59 ` David Woodhouse 2026-05-19 13:53 ` Paolo Bonzini 1 sibling, 1 reply; 29+ messages in thread From: David Woodhouse @ 2026-05-19 12:59 UTC (permalink / raw) To: Marc Zyngier, Paolo Bonzini Cc: Will Deacon, Jonathan Corbet, Shuah Khan, kvm, Linux Doc Mailing List, Kernel Mailing List, Linux, Sean Christopherson, Jim Mattson, Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest [-- Attachment #1: Type: text/plain, Size: 1774 bytes --] On Tue, 2026-05-19 at 13:38 +0100, Marc Zyngier wrote: > > > But in this case there's an ID register that tells KVM if userspace > > wants the old or the new behavior, independent of whether that old > > behavior is architecturally valid or not. > > But the "old behaviour" makes no sense, and cannot be used by a guest: > > - either the guest doesn't use the alternative interrupt groups, then > it wasn't affected by the bug. That's 100% of the guests. > > - or the guest did try to use the alternative groups, and it *NEVER* > worked, as it wouldn't get any interrupt at all. What is the point > of preserving a "feature" that only results in a non-working guest? Given how long this bug existed in KVM, it's entirely feasible that some guests *check* for it and refrain from trying to use the alternative groups if the registers aren't actually writable. If such a guest boots on the new kernel and *does* use alternative groups, and then the kernel is rolled back, it breaks. Or some guest configurations which have only ever been tested under KVM could have a bug where they *rely* on the registers not being writable, and write values which are inconsistent with the rest of their configuration. 
Which breaks the moment those registers become writable. Even in that latter case, when we're hosting customer guests under KVM and we make a change which breaks things, we don't get to tell customers "you deserved it". And those hypothetical cases *do* happen. All of the time. There's a massive zoo of guest operating systems; not just the major players like Linux, FreeBSD and Windows but a whole bunch of embedded home-grown and network appliance kernels. Nobody is claiming that we shouldn't fix any bug ever. [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5069 bytes --] ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations 2026-05-19 12:59 ` David Woodhouse @ 2026-05-19 13:53 ` Paolo Bonzini 2026-05-19 14:13 ` David Woodhouse 0 siblings, 1 reply; 29+ messages in thread From: Paolo Bonzini @ 2026-05-19 13:53 UTC (permalink / raw) To: David Woodhouse Cc: Marc Zyngier, Will Deacon, Jonathan Corbet, Shuah Khan, kvm, Linux Doc Mailing List, Kernel Mailing List, Linux, Sean Christopherson, Jim Mattson, Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest On Tue, May 19, 2026 at 3:00 PM David Woodhouse <dwmw2@infradead.org> wrote: > Or some guest configurations which have only ever been tested under KVM > could have a bug where they *rely* on the registers not being writable, > and write values which are inconsistent with the rest of their > configuration. Which breaks the moment those registers become writable. Yeah, just having guests that worked by utter chance - but you still don't want to break them - is the case that is most likely. Crappy code that runs only under emulation/virtualization appears with probability 1 over time. Is this likely in this specific case---probably not, honestly. Christoffer's patch dates back to 2018 (commit d53c2c29ae0d); *back then* KVM/Arm was a lot less mature, and people developing for Arm on vanilla upstream kernels have moved on from Linux 4.19. I would still lean towards accepting the code considering the limited complexity of the addition (in fact I like it more now that it uses IIDR instead of v2_groups_user_writable, but that's taste). However, there's a huge difference between setting expectations based on 2018 vs 2026 maturity, and perhaps that's why Marc overall is inclined to put this in the category of pointless bug for bug compatibility? 
In any case, there's no arguing over this documentation patch, which is already a good thing to know. Thanks, Paolo > And those hypothetical cases *do* happen. All of the time. There's a > massive zoo of guest operating systems; not just the major players like > Linux, FreeBSD and Windows but a whole bunch of embedded home-grown and > network appliance kernels. > > Nobody is claiming that we shouldn't fix any bug ever. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations 2026-05-19 13:53 ` Paolo Bonzini @ 2026-05-19 14:13 ` David Woodhouse 2026-05-19 21:10 ` Oliver Upton 0 siblings, 1 reply; 29+ messages in thread From: David Woodhouse @ 2026-05-19 14:13 UTC (permalink / raw) To: Paolo Bonzini Cc: Marc Zyngier, Will Deacon, Jonathan Corbet, Shuah Khan, kvm, Linux Doc Mailing List, Kernel Mailing List, Linux, Sean Christopherson, Jim Mattson, Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest [-- Attachment #1: Type: text/plain, Size: 2526 bytes --] On Tue, 2026-05-19 at 15:53 +0200, Paolo Bonzini wrote: > On Tue, May 19, 2026 at 3:00 PM David Woodhouse <dwmw2@infradead.org> wrote: > > Or some guest configurations which have only ever been tested under KVM > > could have a bug where they *rely* on the registers not being writable, > > and write values which are inconsistent with the rest of their > > configuration. Which breaks the moment those registers become writable. > > Yeah, just having guests that worked by utter chance - but you still > don't want to break them - is the case that is most likely. Crappy > code that runs only under emulation/virtualization appears with > probability 1 over time. > > Is this likely in this specific case---probably not, honestly. > Christoffer's patch dates back to 2018 (commit d53c2c29ae0d); *back > then* KVM/Arm was a lot less mature, and people developing for Arm on > vanilla upstream kernels have moved on from Linux 4.19. It's not really 2018 though. EC2 is still running kernels with that older commit reverted because of the breaking change that it introduced. So the behaviour corresponding to GICD_IIDR.implementation_rev=1 is still current for *millions* of guests. I'm now finding that revert in our tree during a *later* kernel upgrade and trying to eliminate it. 
And sure, I have given the engineers responsible for that a very hard stare and suggested that they should have fixed it 'properly' in the first place with a patch like the one we're discussing right now. And they're looking at this thread and saying "haha no, if fixing things properly for Arm is this hard then we'll stick with the crappy approach". I do not want them to be right. I don't think any of us want that. > I would still lean towards accepting the code considering the limited > complexity of the addition (in fact I like it more now that it uses > IIDR instead of v2_groups_user_writable, but that's taste). I'm absolutely prepared to have a separate technical discussion about the v2_groups_user_writable thing for GICv2, which is the second part of that series. It seems like the right thing to do, and as far as I can tell, this code *never* worked with QEMU. But I'm not sure who even cares about GICv2 any more. I couldn't find hardware and I had to test the whole thing inside qemu-tcg. But the 'IIDR defaults to 3 but the *behaviour* doesn't match until you explicitly *set* it to 3' thing seemed so *egregiously* wrong to me, that I fixed it anyway. [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5069 bytes --] ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations 2026-05-19 14:13 ` David Woodhouse @ 2026-05-19 21:10 ` Oliver Upton 2026-05-19 21:58 ` David Woodhouse 0 siblings, 1 reply; 29+ messages in thread From: Oliver Upton @ 2026-05-19 21:10 UTC (permalink / raw) To: David Woodhouse Cc: Paolo Bonzini, Marc Zyngier, Will Deacon, Jonathan Corbet, Shuah Khan, kvm, Linux Doc Mailing List, Kernel Mailing List, Linux, Sean Christopherson, Jim Mattson, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest On Tue, May 19, 2026 at 03:13:30PM +0100, David Woodhouse wrote: > On Tue, 2026-05-19 at 15:53 +0200, Paolo Bonzini wrote: > > On Tue, May 19, 2026 at 3:00 PM David Woodhouse <dwmw2@infradead.org> wrote: > > > Or some guest configurations which have only ever been tested under KVM > > > could have a bug where they *rely* on the registers not being writable, > > > and write values which are inconsistent with the rest of their > > > configuration. Which breaks the moment those registers become writable. > > > > Yeah, just having guests that worked by utter chance - but you still > > don't want to break them - is the case that is most likely. Crappy > > code that runs only under emulation/virtualization appears with > > probability 1 over time. > > > > Is this likely in this specific case---probably not, honestly. > > Christoffer's patch dates back to 2018 (commit d53c2c29ae0d); *back > > then* KVM/Arm was a lot less mature, and people developing for Arm on > > vanilla upstream kernels have moved on from Linux 4.19. > > It's not really 2018 though. EC2 is still running kernels with that > older commit reverted because of the breaking change that it > introduced. > > So the behaviour corresponding to GICD_IIDR.implementation_rev=1 is > still current for *millions* of guests. 
> > I'm now finding that revert in our tree during a *later* kernel upgrade > and trying to eliminate it. Still, as far as upstream is concerned this is damn near a decade old. Decisions that you or your peers made in the downstream don't change that. > And sure, I have given the engineers responsible for that a very hard > stare and suggested that they should have fixed it 'properly' in the > first place with a patch like the one we're discussing right now. > > And they're looking at this thread and saying "haha no, if fixing > things properly for Arm is this hard then we'll stick with the crappy > approach". The appropriate time to ask for accommodation was a *very* long time ago. And in the absence of clear evidence of a guest depending on the broken IGROUPR behavior, I don't see how the guest-side changes of Christoffer's series are any different from the multitude of bug fixes that we take every single release cycle. It is an unfortunate bug and I concur with Marc that it doesn't seem like the sort of thing a guest could rely upon. Because it is very much a bug fix, it should've happened without a change to the revision number. Now, the handling of GICD_IIDR itself is a separate issue. By my count, the series broke UAPI on three separate occasions. Before b489edc36169 IIDR was RAZ/WI from userspace. And of course dd6251e463d3 and d53c2c29ae0d changed the revision with no way of restoring the old value. And really, IIDR should've *never* been used as a buy in for new UAPI because it unnecessarily becomes guest visible. 49a1a2c70a7f ("KVM: arm64: vgic-v3: Advertise GICR_CTLR.{IR, CES} as a new GICD_IIDR revision") is a much better example for IIDR going forward as it gates *guest-side* behavior. > I do not want them to be right. I don't think any of us want that.
> > > I would still lean towards accepting the code considering the limited > > complexity of the addition (in fact I like it more now that it uses > > IIDR instead of v2_groups_user_writable, but that's taste). > > I'm absolutely prepared to have a separate technical discussion about > the v2_groups_user_writable thing for GICv2, which is the second part > of that series. > > It seems like the right thing to do, and as far as I can tell, this > code *never* worked with QEMU. But I'm not sure who even cares about > GICv2 any more. I couldn't find hardware and I had to test the whole > thing inside qemu-tcg. > > But the 'IIDR defaults to 3 but the *behaviour* doesn't match until you > explicitly *set* it to 3' thing seemed so *egregiously* wrong to me, > that I fixed it anyway. Wrong or not, this behavior is documented unambiguously. From the VGICv2 UAPI documentation: """ Userspace should set GICD_IIDR before setting any other registers (both KVM_DEV_ARM_VGIC_GRP_DIST_REGS and KVM_DEV_ARM_VGIC_GRP_CPU_REGS) to ensure the expected behavior. Unless GICD_IIDR has been set from userspace, writes to the interrupt group registers (GICD_IGROUPR) are ignored. """ I'm not inclined to change that. As a way out of this whole mess, can we instead: - Allow userspace to set IIDR.Revision to 1 - Drop any bug emulation from the handling of IGROUPR registers - Special-case the stupid GICv2 UAPI where IGROUPR are only writable if the VMM has written to IIDR and the revision >= 2 Thanks, Oliver ^ permalink raw reply [flat|nested] 29+ messages in thread
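The three-point proposal above can be modelled in a few lines. This is a toy sketch in Python, not kernel code: the class `ToyVgicV2`, its method names, the set of accepted revisions and the default revision of 3 are all hypothetical, chosen for illustration from the values mentioned in this thread. It captures only the rule from the quoted VGICv2 documentation (userspace writes to GICD_IGROUPR are ignored unless userspace has set GICD_IIDR) combined with the proposed "revision >= 2" condition, with guest writes always taking effect.

```python
class ToyVgicV2:
    """Toy model (hypothetical, not kernel code) of the proposed GICv2 rules."""

    SUPPORTED_REVISIONS = (1, 2, 3)  # assumption: revisions named in the thread
    DEFAULT_REVISION = 3             # assumption: default mentioned in the thread

    def __init__(self) -> None:
        self.revision = self.DEFAULT_REVISION
        self.iidr_set_by_userspace = False
        self.igroupr = 0

    def user_set_iidr_revision(self, revision: int) -> None:
        # Point 1: userspace may select revision 1 as well as 2 and 3.
        if revision not in self.SUPPORTED_REVISIONS:
            raise ValueError(f"unsupported revision {revision}")
        self.revision = revision
        self.iidr_set_by_userspace = True

    def guest_write_igroupr(self, value: int) -> None:
        # Point 2: no bug emulation, guest writes always take effect.
        self.igroupr = value & 0xFFFFFFFF

    def user_write_igroupr(self, value: int) -> bool:
        # Point 3: the GICv2 special case. Userspace writes are honoured
        # only after userspace has written IIDR with revision >= 2.
        if self.iidr_set_by_userspace and self.revision >= 2:
            self.igroupr = value & 0xFFFFFFFF
            return True
        return False  # write ignored


vgic = ToyVgicV2()
vgic.guest_write_igroupr(0xF)
print(vgic.user_write_igroupr(0x0))  # ignored: IIDR never set by userspace
vgic.user_set_iidr_revision(2)
print(vgic.user_write_igroupr(0x3))  # honoured
```

In this model, a VMM that never writes GICD_IIDR cannot restore group state that a guest has written, which is the migration concern raised in the reply to this message.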
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations 2026-05-19 21:10 ` Oliver Upton @ 2026-05-19 21:58 ` David Woodhouse 2026-05-19 22:57 ` Oliver Upton 0 siblings, 1 reply; 29+ messages in thread From: David Woodhouse @ 2026-05-19 21:58 UTC (permalink / raw) To: Oliver Upton Cc: Paolo Bonzini, Marc Zyngier, Will Deacon, Jonathan Corbet, Shuah Khan, kvm, Linux Doc Mailing List, Kernel Mailing List, Linux, Sean Christopherson, Jim Mattson, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest [-- Attachment #1: Type: text/plain, Size: 6861 bytes --] On Tue, 2026-05-19 at 14:10 -0700, Oliver Upton wrote: > On Tue, May 19, 2026 at 03:13:30PM +0100, David Woodhouse wrote: > > On Tue, 2026-05-19 at 15:53 +0200, Paolo Bonzini wrote: > > > On Tue, May 19, 2026 at 3:00 PM David Woodhouse <dwmw2@infradead.org> wrote: > > > > Or some guest configurations which have only ever been tested under KVM > > > > could have a bug where they *rely* on the registers not being writable, > > > > and write values which are inconsistent with the rest of their > > > > configuration. Which breaks the moment those registers become writable. > > > > > > Yeah, just having guests that worked by utter chance - but you still > > > don't want to break them - is the case that is most likely. Crappy > > > code that runs only under emulation/virtualization appears with > > > probability 1 over time. > > > > > > Is this likely in this specific case---probably not, honestly. > > > Christoffer's patch dates back to 2018 (commit d53c2c29ae0d); *back > > > then* KVM/Arm was a lot less mature, and people developing for Arm on > > > vanilla upstream kernels have moved on from Linux 4.19. > > > > It's not really 2018 though. EC2 is still running kernels with that > > older commit reverted because of the breaking change that it > > introduced. 
> > > > So the behaviour corresponding to GICD_IIDR.implementation_rev=1 is > > still current for *millions* of guests. > > > > I'm now finding that revert in our tree during a *later* kernel upgrade > > and trying to eliminate it. > > Still, as far as upstream is concerned this is damn near a decade old. > Decisions that you or your peers made in the downstream don't change > that. > > > And sure, I have given the engineers responsible for that a very hard > > stare and suggested that they should have fixed it 'properly' in the > > first place with a patch like the one we're discussing right now. > > > > And they're looking at this thread and saying "haha no, if fixing > > things properly for Arm is this hard then we'll stick with the crappy > > approach". > > The appropriate time to ask for accommodation was a *very* long time ago. > > And in the absence of clear evidence of a guest depending on the broken > IGROUPR behavior, I don't see how the guest-side changes of Christoffer's > series are any different from the multitude of bug fixes that we take > every single release cycle. It is an unfortunate bug and I concur with > Marc that it doesn't seem like the sort of thing a guest could rely > upon. I find this concerning, because I've already explained this. There is a very real possibility of guests simply not *noticing* that they had bugs in this area, as it didn't *matter* what they wrote to these registers since it never worked. There is an even larger possibility of guests having worked around the original issue by *detecting* whether the registers were actually writable before choosing to use the alternative groups. And if such a guest launches on a new kernel and then needs to be rolled back to an older kernel, that will also break. > Because it is very much a bug fix, it should've happened without a > change to the revision number. No.
Changing the revision number in conjunction with the guest-visible behaviour change is *absolutely* the right thing to do. > Now, the handling of GICD_IIDR itself is a separate issue. By my count, > the series broke UAPI on three separate occasions. Before b489edc36169 > IIDR was RAZ/WI from userspace. And of course dd6251e463d3 and d53c2c29ae0d > changed the revision with no way of restoring the old value. > > And really, IIDR should've *never* been used as a buy in for new UAPI > because it unnecessarily becomes guest visible. 49a1a2c70a7f ("KVM: arm64: > vgic-v3: Advertise GICR_CTLR.{IR, CES} as a new GICD_IIDR revision") is > a much better example for IIDR going forward as it gates *guest-side* > behavior. Yes, 49a1a2c70a7f is the exemplar. The guest-visible behaviour changes, so we get a new IIDR revision and the ability to preserve the previous behaviour by setting IIDR to the old value. That is exactly how it should always be done. > > I do not want them to be right. I don't think any of us want that. > > > > > I would still lean towards accepting the code considering the limited > > > complexity of the addition (in fact I like it more now that it uses > > > IIDR instead of v2_groups_user_writable, but that's taste). > > > > I'm absolutely prepared to have a separate technical discussion about > > the v2_groups_user_writable thing for GICv2, which is the second part > > of that series. > > > > It seems like the right thing to do, and as far as I can tell, this > > code *never* worked with QEMU. But I'm not sure who even cares about > > GICv2 any more. I couldn't find hardware and I had to test the whole > > thing inside qemu-tcg. > > > > But the 'IIDR defaults to 3 but the *behaviour* doesn't match until you > > explicitly *set* it to 3' thing seemed so *egregiously* wrong to me, > > that I fixed it anyway. > > Wrong or not, this behavior is documented unambiguously. 
From the VGICv2 > UAPI documentation: > > """ > Userspace should set GICD_IIDR before setting any other registers (both > KVM_DEV_ARM_VGIC_GRP_DIST_REGS and KVM_DEV_ARM_VGIC_GRP_CPU_REGS) to ensure > the expected behavior. Unless GICD_IIDR has been set from userspace, writes > to the interrupt group registers (GICD_IGROUPR) are ignored. > """ > > I'm not inclined to change that. That'll all very well... but as far as I can tell, QEMU *doesn't* set GICD_IIDR, so it still gets the bizarre behaviour where the *guest* can write the registers, but userspace can't. So it looks like it'll work except migration will fail. Am I missing something? But honestly, I don't care one iota about GICv2; I was only trying to do the cleanup while I was there. Feel free to drop that part entirely. > As a way out of this whole mess, can we > instead: > > - Allow userspace to set IIDR.Revision to 1 > > - Drop any bug emulation from the handling of IGROUPR registers It doesn't make sense to allow setting IIDR.Revision to 1 *without* the one-liner that actually implements the corresponding behaviour change in the IGROUPR registers. And as explained at least twice now, it's the behaviour change that's *important* here. The fact that it's a long-standing bug in KVM which downstream has been working around for a long time doesn't matter. The unconditional behavioural change *is* a bug and we should fix it. > - Special-case the stupid GICv2 UAPI where IGROUPR are only writable if > the VMM has written to IIDR and the revision >= 2 That already *is* a special case, right? And you'd rather leave it as it is? [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5069 bytes --] ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations 2026-05-19 21:58 ` David Woodhouse @ 2026-05-19 22:57 ` Oliver Upton 2026-05-19 23:33 ` David Woodhouse 0 siblings, 1 reply; 29+ messages in thread From: Oliver Upton @ 2026-05-19 22:57 UTC (permalink / raw) To: David Woodhouse Cc: Paolo Bonzini, Marc Zyngier, Will Deacon, Jonathan Corbet, Shuah Khan, kvm, Linux Doc Mailing List, Kernel Mailing List, Linux, Sean Christopherson, Jim Mattson, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest On Tue, May 19, 2026 at 10:58:05PM +0100, David Woodhouse wrote: > On Tue, 2026-05-19 at 14:10 -0700, Oliver Upton wrote: > > And in the absence of clear evidence of a guest depending on the broken > > IGROUPR behavior, I don't see how the guest-side changes of Christoffer's > > series are any different from the multitude of bug fixes that we take > > every single release cycle. It is an unfortunate bug and I concur with > > Marc that it doesn't seem like the sort of thing a guest could rely > > upon. > > I find this concerning, because I've already explained this. > > There is a very real possibility of guests simply not *noticing* that > they had bugs in this area, as it didn't *matter* what they wrote to > these registers since it never worked. > > There is an even larger possibility of guests having worked around the > original issue by *detecting* whether the registers were actually > writable before choosing to use the alternative groups. And if such a > guest launches on a new kernel and then needs to be rolled back to an > older kernel, that will also break. The onus is on you to substantiate this claim. I would imagine after carrying the revert for so long that there must be at least one example of such a guest? What ifs and maybes do not meet the bar, in my opinion, for preserving bug emulation in KVM. 
Of course there could be a little flexibility with that but we need to have some way of discriminating between bug fixes and genuine guest expectations around the behavior of virtual hardware. > > Wrong or not, this behavior is documented unambiguously. From the VGICv2 > > UAPI documentation: > > > > """ > > Userspace should set GICD_IIDR before setting any other registers (both > > KVM_DEV_ARM_VGIC_GRP_DIST_REGS and KVM_DEV_ARM_VGIC_GRP_CPU_REGS) to ensure > > the expected behavior. Unless GICD_IIDR has been set from userspace, writes > > to the interrupt group registers (GICD_IGROUPR) are ignored. > > """ > > > > I'm not inclined to change that. > > That'll all very well... but as far as I can tell, QEMU *doesn't* set > GICD_IIDR, so it still gets the bizarre behaviour where the *guest* can > write the registers, but userspace can't. So it looks like it'll work > except migration will fail. Am I missing something? That's exactly it, and why I said tying up UAPI opt-in with guest-visible registers is a really bad idea. > But honestly, I don't care one iota about GICv2; I was only trying to > do the cleanup while I was there. Feel free to drop that part entirely. > > > As a way out of this whole mess, can we > > instead: > > > > - Allow userspace to set IIDR.Revision to 1 > > > > - Drop any bug emulation from the handling of IGROUPR registers > > It doesn't make sense to allow setting IIDR.Revision to 1 *without* the > one-liner that actually implements the corresponding behaviour change > in the IGROUPR registers. As I described earlier, this whole IIDR crap inarguably broke UAPI and obviously normal guest behavior (i.e. reading the register). At minimum we need to permit previously-valid values for IIDR, even if they carry no implied behaviors. > And as explained at least twice now, it's the > behaviour change that's *important* here. > > The fact that it's a long-standing bug in KVM which downstream has been > working around for a long time doesn't matter. 
The unconditional > behavioural change *is* a bug and we should fix it. That is the nature of a bug fix. If you can provide some concrete evidence of a guest depending on the RAZ/WI behavior then I agree we need to preserve the old behavior. Otherwise I see this as a matter of principle in how we do bug fixes to KVM. Even if upstream took the strictest possible stance towards behavior changes we will invariably fail to account for some minutia. > > - Special-case the stupid GICv2 UAPI where IGROUPR are only writable if > > the VMM has written to IIDR and the revision >= 2 > > That already *is* a special case, right? And you'd rather leave it as it is? Left as documented, yes. With the exception that revision == 1 writes not be considered opt-in to restorable IGROUPR. Thanks, Oliver ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations 2026-05-19 22:57 ` Oliver Upton @ 2026-05-19 23:33 ` David Woodhouse 2026-05-20 17:47 ` Oliver Upton 0 siblings, 1 reply; 29+ messages in thread From: David Woodhouse @ 2026-05-19 23:33 UTC (permalink / raw) To: Oliver Upton Cc: Paolo Bonzini, Marc Zyngier, Will Deacon, Jonathan Corbet, Shuah Khan, kvm, Linux Doc Mailing List, Kernel Mailing List, Linux, Sean Christopherson, Jim Mattson, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest [-- Attachment #1: Type: text/plain, Size: 6032 bytes --] On Tue, 2026-05-19 at 15:57 -0700, Oliver Upton wrote: > On Tue, May 19, 2026 at 10:58:05PM +0100, David Woodhouse wrote: > > On Tue, 2026-05-19 at 14:10 -0700, Oliver Upton wrote: > > > And in the absence of clear evidence of a guest depending on the broken > > > IGROUPR behavior, I don't see how the guest-side changes of Christoffer's > > > series are any different from the multitude of bug fixes that we take > > > every single release cycle. It is an unfortunate bug and I concur with > > > Marc that it doesn't seem like the sort of thing a guest could rely > > > upon. > > > > I find this concerning, because I've already explained this. > > > > There is a very real possibility of guests simply not *noticing* that > > they had bugs in this area, as it didn't *matter* what they wrote to > > these registers since it never worked. > > > > There is an even larger possibility of guests having worked around the > > original issue by *detecting* whether the registers were actually > > writable before choosing to use the alternative groups. And if such a > > guest launches on a new kernel and then needs to be rolled back to an > > older kernel, that will also break. > > The onus is on you to substantiate this claim. 
I would imagine after > carrying the revert for so long that there must be at least one example > of such a guest? What? No. We have *avoided* having the bug, specifically so that we do not find out the consequences of the bug. > What ifs and maybes do not meet the bar, in my opinion, for preserving > bug emulation in KVM. Of course there could be a little flexibility with > that but we need to have some way of discriminating between bug fixes > and genuine guest expectations around the behavior of virtual hardware. I believe you have this completely backwards. The expectation of KVM is that we do not change guest-visible behaviour if there's any reasonable chance that it might cause problems. A stable and mature platform doesn't get to play in its ivory tower and randomly inflict breakage on guests because they "deserve it". I've literally explained the potential failure modes, including the one on rollback if a guest *does* change the group configuration and then needs to be rolled back to the older kernel that doesn't support it. And yes, "ifs and maybes" absolutely *are* the quality bar expected by KVM because (again, as already explained more than once) as we accumulate a bunch of such "unlikely" breakages in a fleet upgrade from, say, 6.1 to 6.12, the likelihood of *one* of them actually turning out to afflict *one* of the zoo of guest operating systems approaches 1. We don't get to just YOLO it. > > > Wrong or not, this behavior is documented unambiguously. From the VGICv2 > > > UAPI documentation: > > > > > > """ > > > Userspace should set GICD_IIDR before setting any other registers (both > > > KVM_DEV_ARM_VGIC_GRP_DIST_REGS and KVM_DEV_ARM_VGIC_GRP_CPU_REGS) to ensure > > > the expected behavior. Unless GICD_IIDR has been set from userspace, writes > > > to the interrupt group registers (GICD_IGROUPR) are ignored. > > > """ > > > > > > I'm not inclined to change that. > > > > That'll all very well...
but as far as I can tell, QEMU *doesn't* set > > GICD_IIDR, so it still gets the bizarre behaviour where the *guest* can > > write the registers, but userspace can't. So it looks like it'll work > > except migration will fail. Am I missing something? > > That's exactly it, and why I said tying up UAPI opt-in with > guest-visible registers is a really bad idea. > > > But honestly, I don't care one iota about GICv2; I was only trying to > > do the cleanup while I was there. Feel free to drop that part entirely. > > > > > As a way out of this whole mess, can we > > > instead: > > > > > > - Allow userspace to set IIDR.Revision to 1 > > > > > > - Drop any bug emulation from the handling of IGROUPR registers > > > > It doesn't make sense to allow setting IIDR.Revision to 1 *without* the > > one-liner that actually implements the corresponding behaviour change > > in the IGROUPR registers. > > As I described earlier, this whole IIDR crap inarguably broke UAPI and > obviously normal guest behavior (i.e. reading the register). At minimum > we need to permit previously-valid values for IIDR, even if they carry > no implied behaviors. But the whole *point* of IIDR is to preserve the behaviour. To set the IIDR and *not* have the corresponding behaviour is insanity. > > And as explained at least twice now, it's the > > behaviour change that's *important* here. > > > > The fact that it's a long-standing bug in KVM which downstream has been > > working around for a long time doesn't matter. The unconditional > > behavioural change *is* a bug and we should fix it. > > That is the nature of a bug fix. If you can provide some concrete > evidence of a guest depending on the RAZ/WI behavior then I agree we > need to preserve the old behavior. > > Otherwise I see this as a matter of principle in how we do bug fixes to > KVM. Even if upstream took the strictest possible stance towards behavior > changes we will invariably fail to account for some minutia. No. 
Don't pretend that this is hard. KVM on x86 has been quietly getting this right for years. Yes, there is sometimes *some* subjectivity around it, and it's sometimes reasonable to just unilaterally change behaviours. This is not, and was not, one of those cases. > > > - Special-case the stupid GICv2 UAPI where IGROUPR are only writable if > > > the VMM has written to IIDR and the revision >= 2 > > > > That already *is* a special case, right? And you'd rather leave it as it is? > > Left as documented, yes. With the exception that revision == 1 writes > not be considered opt-in to restorable IGROUPR. Don't do that. Just leave it broken, with QEMU not even working. I'm beyond caring about GICv2 now.
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations 2026-05-19 23:33 ` David Woodhouse @ 2026-05-20 17:47 ` Oliver Upton 2026-05-20 18:29 ` David Woodhouse 0 siblings, 1 reply; 29+ messages in thread From: Oliver Upton @ 2026-05-20 17:47 UTC (permalink / raw) To: David Woodhouse Cc: Paolo Bonzini, Marc Zyngier, Will Deacon, Jonathan Corbet, Shuah Khan, kvm, Linux Doc Mailing List, Kernel Mailing List, Linux, Sean Christopherson, Jim Mattson, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest On Wed, May 20, 2026 at 12:33:52AM +0100, David Woodhouse wrote: > On Tue, 2026-05-19 at 15:57 -0700, Oliver Upton wrote: > > What ifs and maybes do not meet the bar, in my opinion, for preserving > > bug emulation in KVM. Of course there could be a little flexibility with > > that but we need to have some way of discriminating between bug fixes > > and genuine guest expectations around the behavior of virtual hardware. > > I believe you have this completely backwards. No, I really don't. Leaving every bugfix that could _possibly_ have a guest-visible impact subject to drive-by scrutiny many years after the dust has settled is not an acceptable working dynamic. Especially since it would appear that the rest of the ecosystem has long since moved on from this particular issue. If this matters to you so deeply then please, be part of the solution instead. You may find that reviewing patches leads to better outcomes than getting belligerent with the arm64 folks every time you guys decide to rebase your kernel. Hell, hypotheticals actually have a lot more weight in the context of a review. And if your testing is extensive enough to catch these sort of subtleties, don't you think it's better done against mainline? 
Maybe it's just me but I am left feeling disappointed that we all haven't found a productive way of working together. I've tried to bridge the gap here; we obviously need to do something that at least fixes the UAPI breakage. Although apparently we don't even care to meet that low a bar. > A stable and mature platform doesn't get to play in its ivory tower and > randomly inflict breakage on guests because they "deserve it". Really? Aren't you asking for us to emulate something completely broken for you? Thanks, Oliver
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations 2026-05-20 17:47 ` Oliver Upton @ 2026-05-20 18:29 ` David Woodhouse 0 siblings, 0 replies; 29+ messages in thread From: David Woodhouse @ 2026-05-20 18:29 UTC (permalink / raw) To: Oliver Upton Cc: Paolo Bonzini, Marc Zyngier, Will Deacon, Jonathan Corbet, Shuah Khan, kvm, Linux Doc Mailing List, Kernel Mailing List, Linux, Sean Christopherson, Jim Mattson, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest [-- Attachment #1: Type: text/plain, Size: 4141 bytes --] On Wed, 2026-05-20 at 10:47 -0700, Oliver Upton wrote: > On Wed, May 20, 2026 at 12:33:52AM +0100, David Woodhouse wrote: > > On Tue, 2026-05-19 at 15:57 -0700, Oliver Upton wrote: > > > What ifs and maybes do not meet the bar, in my opinion, for preserving > > > bug emulation in KVM. Of course there could be a little flexibility with > > > that but we need to have some way of discriminating between bug fixes > > > and genuine guest expectations around the behavior of virtual hardware. > > > > I believe you have this completely backwards. > > No, I really don't. > > Leaving every bugfix that could _possibly_ have a guest-visible impact > subject to drive-by scrutiny many years after the dust has settled is > not an acceptable working dynamic. Especially since it would appear > that the rest of the ecosystem has long since moved on from this > particular issue. That's reductio ad absurdum. I can continue to work around this one internally, sure. But I'm also concerned about the general case because not only did you refuse it, but you *also* said that this change in guest-visible behaviour "should've happened without a change to the revision number". 
Which seems to indicate that not only are you being randomly obstructive about a one-line fix, you *also* don't actually understand the general concept of what is expected of KVM, which this Documentation patch is intending to clarify. It was *right* to bump the IIDR from 1 to 2 when this guest visible behaviour was changed. The only problem was not letting userspace select the old revision. I'm really concerned that we now appear to have a regression of understanding of even the part we previously *did* get right. > If this matters to you so deeply then please, be part of the solution > instead. You may find that reviewing patches leads to better outcomes > than getting belligerent with the arm64 folks every time you guys > decide to rebase your kernel. Hell, hypotheticals actually have a lot > more weight in the context of a review. And if your testing is extensive > enough to catch these sort of subtleties, don't you think it's better > done against mainline? Yes. Definitely. That's why my series with the fixes is more *test* than actual fix, giving a nice simple framework for any such changes in future. It checks that GICR_CTLR_IR|GICR_CTLR_CES are visible only with IIDR.rev=3 for example. And we're making progress on the amount of downstream crap, but it doesn't help when we seem to have an impedance mismatch on the very question of what it means to support customers on KVM at scale. This thread is not exactly encouraging my engineers to poke their heads above the parapet. > Maybe it's just me but I am left feeling disappointed that we all > haven't found a productive way of working together. I've tried to bridge > the gap here; we obviously need to do something that at least fixes the > UAPI breakage. Although apparently we don't even care to meet that low > of bar. > > > A stable and mature platform doesn't get to play in its ivory tower and > > randomly inflict breakage on guests because they "deserve it". > > Really? 
Aren't you asking for us to emulate something completely broken > for you? No. I'm asking for a path to be able to *fix* it. As things stand, if I just drop these patches and launch guests on a new kernel, those guests will see writable IGROUPR registers and may try to use them. And then if I have to roll *back* a kernel deployment, those guests may lose interrupts. The *only* time a guest-visible feature (or bugfix, nobody cares about the difference outside the ivory tower) can be enabled is when the kernel deployment is finished and stable and *won't* be rolled back. And *then* new launches (and reboots) can get it. And one day, when the last guest which was launched *without* it is finally rebooted and sees the new model, *then* maybe we no longer need that one line if() statement to support IIDR version 1. 2018 was basically *yesterday*. And I'm kind of scared that I even have to explain it. [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5069 bytes --] ^ permalink raw reply [flat|nested] 29+ messages in thread
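Editorial sketch of the rollout rule described above: while a rollback is still possible, newly launched guests should only see a revision that every kernel they might land on can present. The helper name `pick_launch_rev` and the idea of describing each kernel by the highest IIDR revision it supports are assumptions made for illustration; nothing like this exists in KVM itself, and a real VMM would track this in its own fleet tooling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: given the highest GICD_IIDR revision supported by
 * each kernel a guest may be migrated to (including rollback targets),
 * return the highest revision that all of them support.
 */
uint8_t pick_launch_rev(const uint8_t *kernel_max_rev, size_t n)
{
	assert(kernel_max_rev != NULL && n > 0);

	uint8_t rev = kernel_max_rev[0];
	for (size_t i = 1; i < n; i++) {
		if (kernel_max_rev[i] < rev)
			rev = kernel_max_rev[i];
	}
	return rev;
}
```

During an upgrade from a kernel that only supports revision 1 to one that supports 3, the helper returns 1; once the old kernel is no longer a rollback target it returns 3, and only then do new launches and reboots pick up the new behaviour.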
* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations 2026-05-19 12:13 ` Paolo Bonzini 2026-05-19 12:38 ` Marc Zyngier @ 2026-05-19 12:42 ` David Woodhouse 1 sibling, 0 replies; 29+ messages in thread From: David Woodhouse @ 2026-05-19 12:42 UTC (permalink / raw) To: Paolo Bonzini Cc: Will Deacon, Marc Zyngier, Jonathan Corbet, Shuah Khan, kvm, Linux Doc Mailing List, Kernel Mailing List, Linux, Sean Christopherson, Jim Mattson, Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest [-- Attachment #1: Type: text/plain, Size: 2201 bytes --] On Tue, 2026-05-19 at 14:13 +0200, Paolo Bonzini wrote: > I admit that my knowledge of Arm is really limited, and I do not > understand which IIDR values have architecturally allowed behaviors > and which (if any) were made up by KVM; but even if I cannot honestly > remark on the code or even the approach, a compatibility knob is the > right thing to have. That's a userspace API design matter, not an Arm > or GIC matter. To be clear: the "IID" in IIDR is "implementer identification"; the implementer in this case being KVM. The revision field in the IIDR literally *is* the compatibility knob. The values are indeed 'made up by KVM', and have been correctly bumped from 1 to 2 to 3 as guest-visible behaviour has changed. The only problem is that there's no way to set it *back* to 1 again, on a newer kernel (which defaults to 3). We can only set it to 2 or 3. The patch which is causing all this fuss is little more than a one- liner which allows userspace to set it to 1 again, and a second line to actually *honour* the corresponding behaviour that certain registers aren't writable. 
This is the only way to retain that historical behaviour so that we don't have to change it underneath running guests on a kernel upgrade (or worse, rip the new behaviour *away* from newly-launched guests, if we have to roll back to the old kernel after launching on the new one). > I will certainly take this patch, but I won't override Marc. However > I'd like to better understand his point of view, because right now I > just don't get it. Indeed. Like you, I just don't get it. I cannot see any reason *not* to take the fix, and I am *trying* (with limited success) to limit the expression of my frustration to the specific technical issue at hand. Marc, I have a huge amount of respect for you, and I'm painfully aware that I risk burning bridges here by pressing the issue. But on this specific topic I respectfully believe that you have made the wrong decision, and I beg you to reconsider. We *need* to be able to upgrade without changing behaviour for guests. Even if the old behaviour was "wrong" according to the architecture specification. [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5069 bytes --] ^ permalink raw reply [flat|nested] 29+ messages in thread
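Editorial sketch of the "compatibility knob" described in this message: userspace may select any revision from 1 to 3, and the model honours revision 1 by ignoring guest writes to the group registers. It is a simplified illustration under those assumptions, not the actual patch or the kernel's vgic code; the struct, the `vgic_model_*` names, and the zero reset value are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical revision range; the thread says KVM went from 1 to 2 to 3. */
enum { VGIC_REV_MIN = 1, VGIC_REV_MAX = 3 };

struct vgic_model {
	uint8_t  iidr_rev;  /* implementation revision exposed in GICD_IIDR */
	uint32_t igroupr;   /* one bank of interrupt group bits */
};

void vgic_model_init(struct vgic_model *v)
{
	v->iidr_rev = VGIC_REV_MAX; /* newer kernels default to the latest */
	v->igroupr = 0;             /* placeholder reset value */
}

/* Userspace selects the revision; out-of-range values are rejected. */
bool vgic_model_set_rev(struct vgic_model *v, uint8_t rev)
{
	if (rev < VGIC_REV_MIN || rev > VGIC_REV_MAX)
		return false;
	v->iidr_rev = rev;
	return true;
}

/* Guest write to IGROUPR: ignored at revision 1, applied otherwise. */
void vgic_model_guest_write_igroupr(struct vgic_model *v, uint32_t val)
{
	assert(v->iidr_rev >= VGIC_REV_MIN && v->iidr_rev <= VGIC_REV_MAX);
	if (v->iidr_rev < 2)
		return; /* revision 1: registers are not guest-writable */
	v->igroupr = val;
}
```

The point of the sketch is that accepting revision 1 and gating the write path go together: a guest restored with revision 1 keeps seeing the register behaviour it was launched with.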