* [PATCH 0/5] KVM: x86/xen: Restrict hypercall MSR index @ 2025-02-01 1:13 Sean Christopherson 2025-02-01 1:13 ` [PATCH 1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range Sean Christopherson ` (5 more replies) 0 siblings, 6 replies; 28+ messages in thread From: Sean Christopherson @ 2025-02-01 1:13 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini, David Woodhouse, Paul Durrant Cc: kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins, David Woodhouse Address a syzkaller splat by restricting the Xen hypercall MSR index to the de facto standard synthetic range, 0x40000000 - 0x4fffffff. This obviously has the potential to break userspace, but I'm fairly confident it'll be fine (knock wood), and doing nothing is not an option as letting userspace redirect any WRMSR is at best completely broken, and at worst could be used to exploit paths in KVM that directly write hardcoded MSRs. Patches 2-5 are tangentially related cleanups. Sean Christopherson (5): KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range KVM: x86/xen: Add an #ifdef'd helper to detect writes to Xen MSR KVM: x86/xen: Consult kvm_xen_enabled when checking for Xen MSR writes KVM: x86/xen: Bury xen_hvm_config behind CONFIG_KVM_XEN=y KVM: x86/xen: Move kvm_xen_hvm_config field into kvm_xen arch/x86/include/asm/kvm_host.h | 4 ++-- arch/x86/kvm/x86.c | 4 ++-- arch/x86/kvm/xen.c | 28 ++++++++++++++++++---------- arch/x86/kvm/xen.h | 17 +++++++++++++++-- 4 files changed, 37 insertions(+), 16 deletions(-) base-commit: eb723766b1030a23c38adf2348b7c3d1409d11f0 -- 2.48.1.362.g079036d154-goog ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range 2025-02-01 1:13 [PATCH 0/5] KVM: x86/xen: Restrict hypercall MSR index Sean Christopherson @ 2025-02-01 1:13 ` Sean Christopherson 2025-02-03 9:09 ` Paul Durrant ` (2 more replies) 2025-02-01 1:13 ` [PATCH 2/5] KVM: x86/xen: Add an #ifdef'd helper to detect writes to Xen MSR Sean Christopherson ` (4 subsequent siblings) 5 siblings, 3 replies; 28+ messages in thread From: Sean Christopherson @ 2025-02-01 1:13 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini, David Woodhouse, Paul Durrant Cc: kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins, David Woodhouse Reject userspace attempts to set the Xen hypercall page MSR to an index outside of the "standard" virtualization range [0x40000000, 0x4fffffff], as KVM is not equipped to handle collisions with real MSRs, e.g. KVM doesn't update MSR interception, conflicts with VMCS/VMCB fields, special case writes in KVM, etc. Allowing userspace to redirect any MSR write can also be used to attack the kernel, as kvm_xen_write_hypercall_page() takes multiple locks and writes to guest memory. E.g. if userspace sets the MSR to MSR_IA32_XSS, KVM's write to MSR_IA32_XSS during vCPU creation will trigger an SRCU violation due to writing guest memory: ============================= WARNING: suspicious RCU usage 6.13.0-rc3 ----------------------------- include/linux/kvm_host.h:1046 suspicious rcu_dereference_check() usage! stack backtrace: CPU: 6 UID: 1000 PID: 1101 Comm: repro Not tainted 6.13.0-rc3 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Call Trace: <TASK> dump_stack_lvl+0x7f/0x90 lockdep_rcu_suspicious+0x176/0x1c0 kvm_vcpu_gfn_to_memslot+0x259/0x280 kvm_vcpu_write_guest+0x3a/0xa0 kvm_xen_write_hypercall_page+0x268/0x300 kvm_set_msr_common+0xc44/0x1940 vmx_set_msr+0x9db/0x1fc0 kvm_vcpu_reset+0x857/0xb50 kvm_arch_vcpu_create+0x37e/0x4d0 kvm_vm_ioctl+0x669/0x2100 __x64_sys_ioctl+0xc1/0xf0 do_syscall_64+0xc5/0x210 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7feda371b539 While the MSR index isn't strictly ABI, i.e. can theoretically float to any value, in practice no known VMM sets the MSR index to anything other than 0x40000000 or 0x40000200. Reported-by: syzbot+cdeaeec70992eca2d920@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/679258d4.050a0220.2eae65.000a.GAE@google.com Cc: Joao Martins <joao.m.martins@oracle.com> Cc: Paul Durrant <paul@xen.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Sean Christopherson <seanjc@google.com> --- arch/x86/kvm/xen.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c index a909b817b9c0..35ecafc410f0 100644 --- a/arch/x86/kvm/xen.c +++ b/arch/x86/kvm/xen.c @@ -1324,6 +1324,14 @@ int kvm_xen_hvm_config(struct kvm *kvm, struct kvm_xen_hvm_config *xhc) xhc->blob_size_32 || xhc->blob_size_64)) return -EINVAL; + /* + * Restrict the MSR to the range that is unofficially reserved for + * synthetic, virtualization-defined MSRs, e.g. to prevent confusing + * KVM by colliding with a real MSR that requires special handling. + */ + if (xhc->msr && (xhc->msr < 0x40000000 || xhc->msr > 0x4fffffff)) + return -EINVAL; + mutex_lock(&kvm->arch.xen.xen_lock); if (xhc->msr && !kvm->arch.xen_hvm_config.msr) -- 2.48.1.362.g079036d154-goog ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range 2025-02-01 1:13 ` [PATCH 1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range Sean Christopherson @ 2025-02-03 9:09 ` Paul Durrant 2025-02-05 9:27 ` David Woodhouse 2025-02-06 16:51 ` David Woodhouse 2 siblings, 0 replies; 28+ messages in thread From: Paul Durrant @ 2025-02-03 9:09 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini, David Woodhouse Cc: kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins, David Woodhouse On 01/02/2025 01:13, Sean Christopherson wrote: > Reject userspace attempts to set the Xen hypercall page MSR to an index > outside of the "standard" virtualization range [0x40000000, 0x4fffffff], > as KVM is not equipped to handle collisions with real MSRs, e.g. KVM > doesn't update MSR interception, conflicts with VMCS/VMCB fields, special > case writes in KVM, etc. > > Allowing userspace to redirect any MSR write can also be used to attack > the kernel, as kvm_xen_write_hypercall_page() takes multiple locks and > writes to guest memory. E.g. if userspace sets the MSR to MSR_IA32_XSS, > KVM's write to MSR_IA32_XSS during vCPU creation will trigger an SRCU > violation due to writing guest memory: > > ============================= > WARNING: suspicious RCU usage > 6.13.0-rc3 > ----------------------------- > include/linux/kvm_host.h:1046 suspicious rcu_dereference_check() usage! > > stack backtrace: > CPU: 6 UID: 1000 PID: 1101 Comm: repro Not tainted 6.13.0-rc3 > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 > Call Trace: > <TASK> > dump_stack_lvl+0x7f/0x90 > lockdep_rcu_suspicious+0x176/0x1c0 > kvm_vcpu_gfn_to_memslot+0x259/0x280 > kvm_vcpu_write_guest+0x3a/0xa0 > kvm_xen_write_hypercall_page+0x268/0x300 > kvm_set_msr_common+0xc44/0x1940 > vmx_set_msr+0x9db/0x1fc0 > kvm_vcpu_reset+0x857/0xb50 > kvm_arch_vcpu_create+0x37e/0x4d0 > kvm_vm_ioctl+0x669/0x2100 > __x64_sys_ioctl+0xc1/0xf0 > do_syscall_64+0xc5/0x210 > entry_SYSCALL_64_after_hwframe+0x4b/0x53 > RIP: 0033:0x7feda371b539 > > While the MSR index isn't strictly ABI, i.e. can theoretically float to > any value, in practice no known VMM sets the MSR index to anything other > than 0x40000000 or 0x40000200. > > Reported-by: syzbot+cdeaeec70992eca2d920@syzkaller.appspotmail.com > Closes: https://lore.kernel.org/all/679258d4.050a0220.2eae65.000a.GAE@google.com > Cc: Joao Martins <joao.m.martins@oracle.com> > Cc: Paul Durrant <paul@xen.org> > Cc: David Woodhouse <dwmw@amazon.co.uk> > Signed-off-by: Sean Christopherson <seanjc@google.com> > --- > arch/x86/kvm/xen.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > Reviewed-by: Paul Durrant <paul@xen.org> ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range 2025-02-01 1:13 ` [PATCH 1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range Sean Christopherson 2025-02-03 9:09 ` Paul Durrant @ 2025-02-05 9:27 ` David Woodhouse 2025-02-05 15:06 ` Sean Christopherson 2025-02-06 16:51 ` David Woodhouse 2 siblings, 1 reply; 28+ messages in thread From: David Woodhouse @ 2025-02-05 9:27 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini, Paul Durrant Cc: kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins [-- Attachment #1: Type: text/plain, Size: 1762 bytes --] On Fri, 2025-01-31 at 17:13 -0800, Sean Christopherson wrote: > --- a/arch/x86/kvm/xen.c > +++ b/arch/x86/kvm/xen.c > @@ -1324,6 +1324,14 @@ int kvm_xen_hvm_config(struct kvm *kvm, struct kvm_xen_hvm_config *xhc) > xhc->blob_size_32 || xhc->blob_size_64)) > return -EINVAL; > > + /* > + * Restrict the MSR to the range that is unofficially reserved for > + * synthetic, virtualization-defined MSRs, e.g. to prevent confusing > + * KVM by colliding with a real MSR that requires special handling. > + */ > + if (xhc->msr && (xhc->msr < 0x40000000 || xhc->msr > 0x4fffffff)) > + return -EINVAL; > + > mutex_lock(&kvm->arch.xen.xen_lock); > > if (xhc->msr && !kvm->arch.xen_hvm_config.msr) I'd prefer to see #defines for those magic values. Especially as there is a corresponding requirement that they never be set from host context (which is where the potential locking issues come in). Which train of thought leads me to ponder this as an alternative (or additional) solution: --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3733,7 +3733,13 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) u32 msr = msr_info->index; u64 data = msr_info->data; - if (msr && msr == vcpu->kvm->arch.xen_hvm_config.msr) + /* + * Do not allow host-initiated writes to trigger the Xen hypercall + * page setup; it could incur locking paths which are not expected + * if userspace sets the MSR in an unusual location. + */ + if (msr && msr == vcpu->kvm->arch.xen_hvm_config.msr && + !msr_info->host_initiated) return kvm_xen_write_hypercall_page(vcpu, data); switch (msr) { [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5069 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range 2025-02-05 9:27 ` David Woodhouse @ 2025-02-05 15:06 ` Sean Christopherson 2025-02-05 15:26 ` David Woodhouse 0 siblings, 1 reply; 28+ messages in thread From: Sean Christopherson @ 2025-02-05 15:06 UTC (permalink / raw) To: David Woodhouse Cc: Paolo Bonzini, Paul Durrant, kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins On Wed, Feb 05, 2025, David Woodhouse wrote: > On Fri, 2025-01-31 at 17:13 -0800, Sean Christopherson wrote: > > --- a/arch/x86/kvm/xen.c > > +++ b/arch/x86/kvm/xen.c > > @@ -1324,6 +1324,14 @@ int kvm_xen_hvm_config(struct kvm *kvm, struct kvm_xen_hvm_config *xhc) > > xhc->blob_size_32 || xhc->blob_size_64)) > > return -EINVAL; > > > > + /* > > + * Restrict the MSR to the range that is unofficially reserved for > > + * synthetic, virtualization-defined MSRs, e.g. to prevent confusing > > + * KVM by colliding with a real MSR that requires special handling. > > + */ > > + if (xhc->msr && (xhc->msr < 0x40000000 || xhc->msr > 0x4fffffff)) > > + return -EINVAL; > > + > > mutex_lock(&kvm->arch.xen.xen_lock); > > > > if (xhc->msr && !kvm->arch.xen_hvm_config.msr) > > I'd prefer to see #defines for those magic values. Can do. Hmm, and since this would be visible to userspace, arguably the #defines should go in arch/x86/include/uapi/asm/kvm.h > Especially as there is a corresponding requirement that they never be set > from host context (which is where the potential locking issues come in). > Which train of thought leads me to ponder this as an alternative (or > additional) solution: > > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -3733,7 +3733,13 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) > u32 msr = msr_info->index; > u64 data = msr_info->data; > > - if (msr && msr == vcpu->kvm->arch.xen_hvm_config.msr) > + /* > + * Do not allow host-initiated writes to trigger the Xen hypercall > + * page setup; it could incur locking paths which are not expected > + * if userspace sets the MSR in an unusual location. That's just as likely to break userspace. Doing a save/restore on the MSR doesn't make a whole lot of sense since it's effectively a "command" MSR, but IMO it's not any less likely than userspace putting the MSR index outside of the synthetic range. Side topic, upstream QEMU doesn't even appear to put the MSR at the Hyper-V address. It tells the guest that's where the MSR is located, but the config passed to KVM still uses the default. /* Hypercall MSR base address */ if (hyperv_enabled(cpu)) { c->ebx = XEN_HYPERCALL_MSR_HYPERV; kvm_xen_init(cs->kvm_state, c->ebx); } else { c->ebx = XEN_HYPERCALL_MSR; } ... /* hyperv_enabled() doesn't work yet. */ uint32_t msr = XEN_HYPERCALL_MSR; ret = kvm_xen_init(s, msr); if (ret < 0) { return ret; } Userspace breakage aside, disallowng host writes would fix the immediate issue, and I think would mitigate all concerns with putting the host at risk. But it's not enough to actually make an overlapping MSR index work. E.g. if the MSR is passed through to the guest, the write will go through to the hardware MSR, unless the WRMSR happens to be emulated. I really don't want to broadly support redirecting any MSR, because to truly go down that path we'd need to deal with x2APIC, EFER, and other MSRs that have special treatment and meaning. While KVM's stance is usually that a misconfigured vCPU model is userspace's problem, in this case I don't see any value in letting userspace be stupid. It can't work generally, it creates unique ABI for KVM_SET_MSRS, and unless there's a crazy use case I'm overlooking, there's no sane reason for userspace to put the index in outside of the synthetic range (whereas defining seemingly nonsensical CPUID feature bits is useful for testing purposes, implementing support in userspace, etc). ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range 2025-02-05 15:06 ` Sean Christopherson @ 2025-02-05 15:26 ` David Woodhouse 2025-02-05 15:51 ` Sean Christopherson 0 siblings, 1 reply; 28+ messages in thread From: David Woodhouse @ 2025-02-05 15:26 UTC (permalink / raw) To: Sean Christopherson Cc: Paolo Bonzini, Paul Durrant, kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins [-- Attachment #1: Type: text/plain, Size: 6059 bytes --] On Wed, 2025-02-05 at 07:06 -0800, Sean Christopherson wrote: > On Wed, Feb 05, 2025, David Woodhouse wrote: > > On Fri, 2025-01-31 at 17:13 -0800, Sean Christopherson wrote: > > > --- a/arch/x86/kvm/xen.c > > > +++ b/arch/x86/kvm/xen.c > > > @@ -1324,6 +1324,14 @@ int kvm_xen_hvm_config(struct kvm *kvm, struct kvm_xen_hvm_config *xhc) > > > xhc->blob_size_32 || xhc->blob_size_64)) > > > return -EINVAL; > > > > > > + /* > > > + * Restrict the MSR to the range that is unofficially reserved for > > > + * synthetic, virtualization-defined MSRs, e.g. to prevent confusing > > > + * KVM by colliding with a real MSR that requires special handling. > > > + */ > > > + if (xhc->msr && (xhc->msr < 0x40000000 || xhc->msr > 0x4fffffff)) > > > + return -EINVAL; > > > + > > > mutex_lock(&kvm->arch.xen.xen_lock); > > > > > > if (xhc->msr && !kvm->arch.xen_hvm_config.msr) > > > > I'd prefer to see #defines for those magic values. > > Can do. Hmm, and since this would be visible to userspace, arguably the #defines > should go in arch/x86/include/uapi/asm/kvm.h Thanks. > > Especially as there is a corresponding requirement that they never be set > > from host context (which is where the potential locking issues come in). > > Which train of thought leads me to ponder this as an alternative (or > > additional) solution: > > > > --- a/arch/x86/kvm/x86.c > > +++ b/arch/x86/kvm/x86.c > > @@ -3733,7 +3733,13 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) > > u32 msr = msr_info->index; > > u64 data = msr_info->data; > > > > - if (msr && msr == vcpu->kvm->arch.xen_hvm_config.msr) > > + /* > > + * Do not allow host-initiated writes to trigger the Xen hypercall > > + * page setup; it could incur locking paths which are not expected > > + * if userspace sets the MSR in an unusual location. > > That's just as likely to break userspace. Doing a save/restore on the MSR doesn't > make a whole lot of sense since it's effectively a "command" MSR, but IMO it's not > any less likely than userspace putting the MSR index outside of the synthetic range. Save/restore on the MSR makes no sense. It's a write-only MSR; writing to it has no effect *other* than populating the target page. In KVM we don't implement reading from it at all; I don't think Xen does either? And even if it was readable and would rather pointlessly return the last value written to it, save/restore arguably shouldn't actually trigger the guest memory to be overwritten again. The hypercall page should only be populated when the *guest* writes the MSR. With the recent elimination of the hypercall page from Linux Xen guests, we've suggested that Linux should still set up the hypercall page early (as it *does* have the side-effect of letting Xen know that the guest is 64-bit). And then just free the page without ever using it. We absolutely would not want a save/restore to scribble on that page again. I'm absolutely not worried about breaking userspace with such a change to make the hypercall page MSR only work when !host_initiated. In fact I think it's probably the right thing to do *anyway*. If userspace wants to write to guest memory, it can do that anyway; it doesn't need to ask the *kernel* to do it. > Side topic, upstream QEMU doesn't even appear to put the MSR at the Hyper-V > address. It tells the guest that's where the MSR is located, but the config > passed to KVM still uses the default. > > /* Hypercall MSR base address */ > if (hyperv_enabled(cpu)) { > c->ebx = XEN_HYPERCALL_MSR_HYPERV; > kvm_xen_init(cs->kvm_state, c->ebx); > } else { > c->ebx = XEN_HYPERCALL_MSR; > } > > ... > > /* hyperv_enabled() doesn't work yet. */ > uint32_t msr = XEN_HYPERCALL_MSR; > ret = kvm_xen_init(s, msr); > if (ret < 0) { > return ret; > } > Those two happen in reverse chronological order, don't they? And in the lower one the comment tells you that hyperv_enabled() doesn't work yet. When the higher one is called later, it calls kvm_xen_init() *again* to put the MSR in the right place. It could be prettier, but I don't think it's broken, is it? > Userspace breakage aside, disallowng host writes would fix the immediate issue, > and I think would mitigate all concerns with putting the host at risk. But it's > not enough to actually make an overlapping MSR index work. E.g. if the MSR is > passed through to the guest, the write will go through to the hardware MSR, unless > the WRMSR happens to be emulated. > > I really don't want to broadly support redirecting any MSR, because to truly go > down that path we'd need to deal with x2APIC, EFER, and other MSRs that have > special treatment and meaning. > > While KVM's stance is usually that a misconfigured vCPU model is userspace's > problem, in this case I don't see any value in letting userspace be stupid. It > can't work generally, it creates unique ABI for KVM_SET_MSRS, and unless there's > a crazy use case I'm overlooking, there's no sane reason for userspace to put the > index in outside of the synthetic range (whereas defining seemingly nonsensical > CPUID feature bits is useful for testing purposes, implementing support in > userspace, etc). Right, I think we should do *both*. Blocking host writes solves the issue of locking problems with the hypercall page setup. All it would take for that issue to recur is for us (or Microsoft) to invent a new MSR in the synthetic range which is also written on vCPU init/reset. And then the sanity check on where the VMM puts the Xen MSR doesn't save us. But yes, we should *also* do that sanity check. [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5069 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range 2025-02-05 15:26 ` David Woodhouse @ 2025-02-05 15:51 ` Sean Christopherson 2025-02-05 16:18 ` David Woodhouse 2025-02-06 9:18 ` David Woodhouse 0 siblings, 2 replies; 28+ messages in thread From: Sean Christopherson @ 2025-02-05 15:51 UTC (permalink / raw) To: David Woodhouse Cc: Paolo Bonzini, Paul Durrant, kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins On Wed, Feb 05, 2025, David Woodhouse wrote: > On Wed, 2025-02-05 at 07:06 -0800, Sean Christopherson wrote: > > On Wed, Feb 05, 2025, David Woodhouse wrote: > > > Especially as there is a corresponding requirement that they never be set > > > from host context (which is where the potential locking issues come in). > > > Which train of thought leads me to ponder this as an alternative (or > > > additional) solution: > > > > > > --- a/arch/x86/kvm/x86.c > > > +++ b/arch/x86/kvm/x86.c > > > @@ -3733,7 +3733,13 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) > > > u32 msr = msr_info->index; > > > u64 data = msr_info->data; > > > > > > - if (msr && msr == vcpu->kvm->arch.xen_hvm_config.msr) > > > + /* > > > + * Do not allow host-initiated writes to trigger the Xen hypercall > > > + * page setup; it could incur locking paths which are not expected > > > + * if userspace sets the MSR in an unusual location. > > > > That's just as likely to break userspace. Doing a save/restore on the MSR doesn't > > make a whole lot of sense since it's effectively a "command" MSR, but IMO it's not > > any less likely than userspace putting the MSR index outside of the synthetic range. > > Save/restore on the MSR makes no sense. It's a write-only MSR; writing > to it has no effect *other* than populating the target page. In KVM we > don't implement reading from it at all; I don't think Xen does either? Hah, that's another KVM bug, technically. KVM relies on the MSR not being handled in order to generate the write-only semantics, but if the MSR index collides with an MSR that KVM emulates, then the MSR would be readable. KVM supports Hyper-V's HV_X64_MSR_TSC_INVARIANT_CONTROL (0x40000118), so just a few hundred more MSRs until fireworks :-) If we want to close that hole, it'd be easy enough to add a check in kvm_get_msr_common(). > Those two happen in reverse chronological order, don't they? And in the > lower one the comment tells you that hyperv_enabled() doesn't work yet. > When the higher one is called later, it calls kvm_xen_init() *again* to > put the MSR in the right place. > > It could be prettier, but I don't think it's broken, is it? Gah, -ENOCOFFEE. > > Userspace breakage aside, disallowng host writes would fix the immediate issue, > > and I think would mitigate all concerns with putting the host at risk. But it's > > not enough to actually make an overlapping MSR index work. E.g. if the MSR is > > passed through to the guest, the write will go through to the hardware MSR, unless > > the WRMSR happens to be emulated. > > > > I really don't want to broadly support redirecting any MSR, because to truly go > > down that path we'd need to deal with x2APIC, EFER, and other MSRs that have > > special treatment and meaning. > > > > While KVM's stance is usually that a misconfigured vCPU model is userspace's > > problem, in this case I don't see any value in letting userspace be stupid. It > > can't work generally, it creates unique ABI for KVM_SET_MSRS, and unless there's > > a crazy use case I'm overlooking, there's no sane reason for userspace to put the > > index in outside of the synthetic range (whereas defining seemingly nonsensical > > CPUID feature bits is useful for testing purposes, implementing support in > > userspace, etc). > > Right, I think we should do *both*. Blocking host writes solves the > issue of locking problems with the hypercall page setup. All it would > take for that issue to recur is for us (or Microsoft) to invent a new > MSR in the synthetic range which is also written on vCPU init/reset. > And then the sanity check on where the VMM puts the Xen MSR doesn't > save us. Ugh, indeed. MSRs are quite the conundrum. Userspace MSR filters have a similar problem, where it's impossible to know the semantics of future hardware MSRs, and so it's impossible to document which MSRs userspace is allowed to intercept :-/ Oh! It doesn't help KVM avoid breaking userspace, but a way for QEMU to avoid a future collision would be to have QEMU start at 0x40000200 when Hyper-V is enabled, but then use KVM_GET_MSR_INDEX_LIST to detect a collision with KVM Hyper-V, e.g. increment the index until an available index is found (with sanity checks and whatnot). > But yes, we should *also* do that sanity check. Ah, I'm a-ok with that. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range 2025-02-05 15:51 ` Sean Christopherson @ 2025-02-05 16:18 ` David Woodhouse 2025-02-05 17:15 ` David Woodhouse 2025-02-06 9:18 ` David Woodhouse 1 sibling, 1 reply; 28+ messages in thread From: David Woodhouse @ 2025-02-05 16:18 UTC (permalink / raw) To: Sean Christopherson Cc: Paolo Bonzini, Paul Durrant, kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins On 5 February 2025 15:51:01 GMT, Sean Christopherson <seanjc@google.com> wrote: >On Wed, Feb 05, 2025, David Woodhouse wrote: >> >> Save/restore on the MSR makes no sense. It's a write-only MSR; writing >> to it has no effect *other* than populating the target page. In KVM we >> don't implement reading from it at all; I don't think Xen does either? > >Hah, that's another KVM bug, technically. KVM relies on the MSR not being handled >in order to generate the write-only semantics, but if the MSR index collides with >an MSR that KVM emulates, then the MSR would be readable. KVM supports Hyper-V's >HV_X64_MSR_TSC_INVARIANT_CONTROL (0x40000118), so just a few hundred more MSRs >until fireworks :-) Nah, I don't see that as a bug. If there's a conflict then the Xen hypercall MSR "steals" writes from the one it conflicts with, sure. But since it's a write-only MSR the conflicting one still works fine for reads. Which means that Xen can "conflict" with a read-only MSR and nobody cares; arguably there's no bug at all in that case. >> Those two happen in reverse chronological order, don't they? And in the >> lower one the comment tells you that hyperv_enabled() doesn't work yet. >> When the higher one is called later, it calls kvm_xen_init() *again* to >> put the MSR in the right place. >> >> It could be prettier, but I don't think it's broken, is it? > >Gah, -ENOCOFFEE. I'll take the criticism though; that code is distinctly non-obvious, even with that hint in the comment about hyperv_enabled() not being usable yet. ISTR we needed to do the Xen init early on, even before we knew precisely which MSR to use. And that's why we do it that way and then just call the function again later if we need to change the MSR. I'll see if that can be simplified, and at the very least I'll update the existing comment to explicitly state that the function will get called again later if needed. You shouldn't *need* coffee to understand the code. >> > Userspace breakage aside, disallowng host writes would fix the immediate issue, >> > and I think would mitigate all concerns with putting the host at risk. But it's >> > not enough to actually make an overlapping MSR index work. E.g. if the MSR is >> > passed through to the guest, the write will go through to the hardware MSR, unless >> > the WRMSR happens to be emulated. >> > >> > I really don't want to broadly support redirecting any MSR, because to truly go >> > down that path we'd need to deal with x2APIC, EFER, and other MSRs that have >> > special treatment and meaning. >> > >> > While KVM's stance is usually that a misconfigured vCPU model is userspace's >> > problem, in this case I don't see any value in letting userspace be stupid. It >> > can't work generally, it creates unique ABI for KVM_SET_MSRS, and unless there's >> > a crazy use case I'm overlooking, there's no sane reason for userspace to put the >> > index in outside of the synthetic range (whereas defining seemingly nonsensical >> > CPUID feature bits is useful for testing purposes, implementing support in >> > userspace, etc). >> >> Right, I think we should do *both*. Blocking host writes solves the >> issue of locking problems with the hypercall page setup. All it would >> take for that issue to recur is for us (or Microsoft) to invent a new >> MSR in the synthetic range which is also written on vCPU init/reset. >> And then the sanity check on where the VMM puts the Xen MSR doesn't >> save us. > >Ugh, indeed. MSRs are quite the conundrum. Userspace MSR filters have a similar >problem, where it's impossible to know the semantics of future hardware MSRs, and >so it's impossible to document which MSRs userspace is allowed to intercept :-/ > >Oh! It doesn't help KVM avoid breaking userspace, but a way for QEMU to avoid a >future collision would be to have QEMU start at 0x40000200 when Hyper-V is enabled, >but then use KVM_GET_MSR_INDEX_LIST to detect a collision with KVM Hyper-V, e.g. >increment the index until an available index is found (with sanity checks and whatnot). Makes sense. I think that's a third separate patch, yes? >> But yes, we should *also* do that sanity check. > >Ah, I'm a-ok with that. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range 2025-02-05 16:18 ` David Woodhouse @ 2025-02-05 17:15 ` David Woodhouse 2025-02-05 19:20 ` Sean Christopherson 0 siblings, 1 reply; 28+ messages in thread From: David Woodhouse @ 2025-02-05 17:15 UTC (permalink / raw) To: Sean Christopherson Cc: Paolo Bonzini, Paul Durrant, kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins [-- Attachment #1: Type: text/plain, Size: 850 bytes --] On Wed, 2025-02-05 at 16:18 +0000, David Woodhouse wrote: > > > Oh! It doesn't help KVM avoid breaking userspace, but a way for QEMU to avoid a > > future collision would be to have QEMU start at 0x40000200 when Hyper-V is enabled, > > but then use KVM_GET_MSR_INDEX_LIST to detect a collision with KVM Hyper-V, e.g. > > increment the index until an available index is found (with sanity checks and whatnot). > > Makes sense. I think that's a third separate patch, yes? To be clear, I think I mean a third patch which further restricts kvm_xen_hvm_config() to disallow indices for which kvm_is_advertised_msr() returns true? We could roll that into your original patch instead, if you prefer. Q: Should kvm_is_advertised_msr() include the Xen hypercall MSR, if one is already configured? Life is easier if we answer 'no'... [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5069 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range 2025-02-05 17:15 ` David Woodhouse @ 2025-02-05 19:20 ` Sean Christopherson 2025-02-06 18:58 ` David Woodhouse 0 siblings, 1 reply; 28+ messages in thread From: Sean Christopherson @ 2025-02-05 19:20 UTC (permalink / raw) To: David Woodhouse Cc: Paolo Bonzini, Paul Durrant, kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins On Wed, Feb 05, 2025, David Woodhouse wrote: > On Wed, 2025-02-05 at 16:18 +0000, David Woodhouse wrote: > > > > > Oh! It doesn't help KVM avoid breaking userspace, but a way for QEMU to avoid a > > > future collision would be to have QEMU start at 0x40000200 when Hyper-V is enabled, > > > but then use KVM_GET_MSR_INDEX_LIST to detect a collision with KVM Hyper-V, e.g. > > > increment the index until an available index is found (with sanity checks and whatnot). > > > > Makes sense. I think that's a third separate patch, yes? > > To be clear, I think I mean a third patch which further restricts > kvm_xen_hvm_config() to disallow indices for which > kvm_is_advertised_msr() returns true? > > We could roll that into your original patch instead, if you prefer. Nah, I like the idea of separate patch. > Q: Should kvm_is_advertised_msr() include the Xen hypercall MSR, if one > is already configured? Life is easier if we answer 'no'... No :-) The idea with kvm_is_advertised_msr() is to ignore accesses to MSRs that don't exist according the to vCPU model, but that KVM advertised to userspace (via KVM_GET_MSR_INDEX_LIST) and so may be saved/restored by a naive/unoptimized userspace. For the Xen MSR, KVM never advertises the MSR, and IIUC, KVM will never treat the MSR as non-existent because defining the MSR brings it into existence. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range 2025-02-05 19:20 ` Sean Christopherson @ 2025-02-06 18:58 ` David Woodhouse 2025-02-07 17:18 ` Sean Christopherson 0 siblings, 1 reply; 28+ messages in thread From: David Woodhouse @ 2025-02-06 18:58 UTC (permalink / raw) To: Sean Christopherson Cc: Paolo Bonzini, Paul Durrant, kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins [-- Attachment #1: Type: text/plain, Size: 1152 bytes --] On Wed, 2025-02-05 at 11:20 -0800, Sean Christopherson wrote: > On Wed, Feb 05, 2025, David Woodhouse wrote: > > On Wed, 2025-02-05 at 16:18 +0000, David Woodhouse wrote: > > > > > > > Oh! It doesn't help KVM avoid breaking userspace, but a way for QEMU to avoid a > > > > future collision would be to have QEMU start at 0x40000200 when Hyper-V is enabled, > > > > but then use KVM_GET_MSR_INDEX_LIST to detect a collision with KVM Hyper-V, e.g. > > > > increment the index until an available index is found (with sanity checks and whatnot). > > > > > > Makes sense. I think that's a third separate patch, yes? > > > > To be clear, I think I mean a third patch which further restricts > > kvm_xen_hvm_config() to disallow indices for which > > kvm_is_advertised_msr() returns true? > > > > We could roll that into your original patch instead, if you prefer. > > Nah, I like the idea of separate patch. Helpfully, kvm_is_advertised_msr() doesn't actually return true for MSR_IA32_XSS. Is that a bug? And kvm_vcpu_reset() attempts to set MSR_IA32_XSS even if the guest doesn't have X86_FEATURE_XSAVES. Is that a bug? [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5069 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range 2025-02-06 18:58 ` David Woodhouse @ 2025-02-07 17:18 ` Sean Christopherson 0 siblings, 0 replies; 28+ messages in thread From: Sean Christopherson @ 2025-02-07 17:18 UTC (permalink / raw) To: David Woodhouse Cc: Paolo Bonzini, Paul Durrant, kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins On Thu, Feb 06, 2025, David Woodhouse wrote: > On Wed, 2025-02-05 at 11:20 -0800, Sean Christopherson wrote: > > On Wed, Feb 05, 2025, David Woodhouse wrote: > > > On Wed, 2025-02-05 at 16:18 +0000, David Woodhouse wrote: > > > > > > > > > Oh! It doesn't help KVM avoid breaking userspace, but a way for QEMU to avoid a > > > > > future collision would be to have QEMU start at 0x40000200 when Hyper-V is enabled, > > > > > but then use KVM_GET_MSR_INDEX_LIST to detect a collision with KVM Hyper-V, e.g. > > > > > increment the index until an available index is found (with sanity checks and whatnot). > > > > > > > > Makes sense. I think that's a third separate patch, yes? > > > > > > To be clear, I think I mean a third patch which further restricts > > > kvm_xen_hvm_config() to disallow indices for which > > > kvm_is_advertised_msr() returns true? > > > > > > We could roll that into your original patch instead, if you prefer. > > > > Nah, I like the idea of separate patch. > > Helpfully, kvm_is_advertised_msr() doesn't actually return true for > MSR_IA32_XSS. Is that a bug? Technically, no. KVM doesn't support a non-zero XSS, yet, and so there's nothing for userspace to save/restore. But the word "yet"... > And kvm_vcpu_reset() attempts to set MSR_IA32_XSS even if the guest > doesn't have X86_FEATURE_XSAVES. Is that a bug? Ugh, sort of. Functionally, it's fine. Though it's quite the mess. The write can be straight up deleted, as the vCPU structure is zero-allocated and the CPUID side effects that using __kvm_set_msr() is intended to deal with are irrelevant because the CPUID array can't yet exist. The code came about due to an SDM bug and racing patches, and we missed that the __kvm_set_msr() would be pointless. The SDM had a bug that said XSS was cleared to '0' on INIT, and then KVM had a bug in its emulation of the buggy INIT logic where KVM open coded clearing ia32_xss, which led to stale CPUID information (XSTATE sizes weren't updated). While the patch[1] that became commit 05a9e065059e ("KVM: x86: Sync the states size with the XCR0/IA32_XSS at, any time") was in flight, Xiayoao reported the SDM bug and sent a fix[2]. I merged the two changes, but overlooked that at RESET, the CPUID array is guaranteed to be null/empty, and so calling into kvm_update_cpuid_runtime() is technically pointless. And because guest CPUID is empty, the vCPU can't possibly have X86_FEATURE_XSAVES, so gating the write on XSAVES would be even weirder and more confusing. I'm not sure what the best answer is. I'm leaning towards simply deleting the write. I'd love to have a better MSR framework in KVM, e.g. to document which MSRs are modified by INIT, but at this point I think writing '0' to an MSR during RESET (a.k.a. vCPU creation) does more harm than good. [1] https://lore.kernel.org/all/20220117082631.86143-1-likexu@tencent.com [2] https://lore.kernel.org/all/20220126034750.2495371-1-xiaoyao.li@intel.com ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range 2025-02-05 15:51 ` Sean Christopherson 2025-02-05 16:18 ` David Woodhouse @ 2025-02-06 9:18 ` David Woodhouse 1 sibling, 0 replies; 28+ messages in thread From: David Woodhouse @ 2025-02-06 9:18 UTC (permalink / raw) To: Sean Christopherson Cc: Paolo Bonzini, Paul Durrant, kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins [-- Attachment #1: Type: text/plain, Size: 6247 bytes --] On Wed, 2025-02-05 at 07:51 -0800, Sean Christopherson wrote: > > > Those two happen in reverse chronological order, don't they? And in the > > lower one the comment tells you that hyperv_enabled() doesn't work yet. > > When the higher one is called later, it calls kvm_xen_init() *again* to > > put the MSR in the right place. > > > > It could be prettier, but I don't think it's broken, is it? > > Gah, -ENOCOFFEE. I trust this version would require less coffee to parse... From a90c2df0bd9589609085dd42f94b61de1bf48eb7 Mon Sep 17 00:00:00 2001 From: David Woodhouse <dwmw@amazon.co.uk> Date: Thu, 6 Feb 2025 08:52:52 +0000 Subject: [PATCH] i386/xen: Move KVM_XEN_HVM_CONFIG ioctl to kvm_xen_init_vcpu() At the time kvm_xen_init() is called, hyperv_enabled() doesn't yet work, so the correct MSR index to use for the hypercall page isn't known. Rather than setting it to the default and then shifting it later for the Hyper-V case with a confusing second call to kvm_init_xen(), just do it once in kvm_xen_init_vcpu(). Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> --- target/i386/kvm/kvm.c | 16 +++++--------- target/i386/kvm/xen-emu.c | 44 ++++++++++++++++++++------------------- target/i386/kvm/xen-emu.h | 4 ++-- 3 files changed, 30 insertions(+), 34 deletions(-) diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index 6c749d4ee8..b4deec6255 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -2129,6 +2129,8 @@ int kvm_arch_init_vcpu(CPUState *cs) if (cs->kvm_state->xen_version) { #ifdef CONFIG_XEN_EMU struct kvm_cpuid_entry2 *xen_max_leaf; + uint32_t hypercall_msr = + hyperv_enabled(cpu) ? XEN_HYPERCALL_MSR_HYPERV : XEN_HYPERCALL_MSR; memcpy(signature, "XenVMMXenVMM", 12); @@ -2150,13 +2152,7 @@ int kvm_arch_init_vcpu(CPUState *cs) c->function = kvm_base + XEN_CPUID_HVM_MSR; /* Number of hypercall-transfer pages */ c->eax = 1; - /* Hypercall MSR base address */ - if (hyperv_enabled(cpu)) { - c->ebx = XEN_HYPERCALL_MSR_HYPERV; - kvm_xen_init(cs->kvm_state, c->ebx); - } else { - c->ebx = XEN_HYPERCALL_MSR; - } + c->ebx = hypercall_msr; c->ecx = 0; c->edx = 0; @@ -2194,7 +2190,7 @@ int kvm_arch_init_vcpu(CPUState *cs) } } - r = kvm_xen_init_vcpu(cs); + r = kvm_xen_init_vcpu(cs, hypercall_msr); if (r) { return r; } @@ -3245,9 +3241,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s) error_report("kvm: Xen support only available in PC machine"); return -ENOTSUP; } - /* hyperv_enabled() doesn't work yet. */ - uint32_t msr = XEN_HYPERCALL_MSR; - ret = kvm_xen_init(s, msr); + ret = kvm_xen_init(s); if (ret < 0) { return ret; } diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c index e81a245881..1144a6efcd 100644 --- a/target/i386/kvm/xen-emu.c +++ b/target/i386/kvm/xen-emu.c @@ -108,15 +108,11 @@ static inline int kvm_copy_to_gva(CPUState *cs, uint64_t gva, void *buf, return kvm_gva_rw(cs, gva, buf, sz, true); } -int kvm_xen_init(KVMState *s, uint32_t hypercall_msr) +int kvm_xen_init(KVMState *s) { const int required_caps = KVM_XEN_HVM_CONFIG_HYPERCALL_MSR | KVM_XEN_HVM_CONFIG_INTERCEPT_HCALL | KVM_XEN_HVM_CONFIG_SHARED_INFO; - struct kvm_xen_hvm_config cfg = { - .msr = hypercall_msr, - .flags = KVM_XEN_HVM_CONFIG_INTERCEPT_HCALL, - }; - int xen_caps, ret; + int xen_caps; xen_caps = kvm_check_extension(s, KVM_CAP_XEN_HVM); if (required_caps & ~xen_caps) { @@ -130,20 +126,6 @@ int kvm_xen_init(KVMState *s, uint32_t hypercall_msr) .u.xen_version = s->xen_version, }; (void)kvm_vm_ioctl(s, KVM_XEN_HVM_SET_ATTR, &ha); - - cfg.flags |= KVM_XEN_HVM_CONFIG_EVTCHN_SEND; - } - - ret = kvm_vm_ioctl(s, KVM_XEN_HVM_CONFIG, &cfg); - if (ret < 0) { - error_report("kvm: Failed to enable Xen HVM support: %s", - strerror(-ret)); - return ret; - } - - /* If called a second time, don't repeat the rest of the setup. */ - if (s->xen_caps) { - return 0; } /* @@ -185,10 +167,14 @@ int kvm_xen_init(KVMState *s, uint32_t hypercall_msr) return 0; } -int kvm_xen_init_vcpu(CPUState *cs) +int kvm_xen_init_vcpu(CPUState *cs, uint32_t hypercall_msr) { X86CPU *cpu = X86_CPU(cs); CPUX86State *env = &cpu->env; + struct kvm_xen_hvm_config cfg = { + .msr = hypercall_msr, + .flags = KVM_XEN_HVM_CONFIG_INTERCEPT_HCALL, + }; int err; /* @@ -210,6 +196,22 @@ int kvm_xen_init_vcpu(CPUState *cs) strerror(-err)); return err; } + + cfg.flags |= KVM_XEN_HVM_CONFIG_EVTCHN_SEND; + } + + /* + * This is a per-KVM setting, but hyperv_enabled() can't be used + * when kvm_xen_init() is called from kvm_arch_init(), so do it + * when the BSP is initialized. + */ + if (cs->cpu_index == 0) { + err = kvm_vm_ioctl(cs->kvm_state, KVM_XEN_HVM_CONFIG, &cfg); + if (err) { + error_report("kvm: Failed to enable Xen HVM support: %s", + strerror(-err)); + return err; + } } env->xen_vcpu_info_gpa = INVALID_GPA; diff --git a/target/i386/kvm/xen-emu.h b/target/i386/kvm/xen-emu.h index fe85e0b195..7a7c72eee5 100644 --- a/target/i386/kvm/xen-emu.h +++ b/target/i386/kvm/xen-emu.h @@ -23,8 +23,8 @@ #define XEN_VERSION(maj, min) ((maj) << 16 | (min)) -int kvm_xen_init(KVMState *s, uint32_t hypercall_msr); -int kvm_xen_init_vcpu(CPUState *cs); +int kvm_xen_init(KVMState *s); +int kvm_xen_init_vcpu(CPUState *cs, uint32_t hypercall_msr); int kvm_xen_handle_exit(X86CPU *cpu, struct kvm_xen_exit *exit); int kvm_put_xen_state(CPUState *cs); int kvm_get_xen_state(CPUState *cs); -- 2.48.1 [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5069 bytes --] ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range 2025-02-01 1:13 ` [PATCH 1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range Sean Christopherson 2025-02-03 9:09 ` Paul Durrant 2025-02-05 9:27 ` David Woodhouse @ 2025-02-06 16:51 ` David Woodhouse 2 siblings, 0 replies; 28+ messages in thread From: David Woodhouse @ 2025-02-06 16:51 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini, Paul Durrant Cc: kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins [-- Attachment #1: Type: text/plain, Size: 2603 bytes --] On Fri, 2025-01-31 at 17:13 -0800, Sean Christopherson wrote: > Reject userspace attempts to set the Xen hypercall page MSR to an index > outside of the "standard" virtualization range [0x40000000, 0x4fffffff], > as KVM is not equipped to handle collisions with real MSRs, e.g. KVM > doesn't update MSR interception, conflicts with VMCS/VMCB fields, special > case writes in KVM, etc. > > Allowing userspace to redirect any MSR write can also be used to attack > the kernel, as kvm_xen_write_hypercall_page() takes multiple locks and > writes to guest memory. E.g. if userspace sets the MSR to MSR_IA32_XSS, > KVM's write to MSR_IA32_XSS during vCPU creation will trigger an SRCU > violation due to writing guest memory: > > ============================= > WARNING: suspicious RCU usage > 6.13.0-rc3 > ----------------------------- > include/linux/kvm_host.h:1046 suspicious rcu_dereference_check() usage! > > stack backtrace: > CPU: 6 UID: 1000 PID: 1101 Comm: repro Not tainted 6.13.0-rc3 > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 > Call Trace: > <TASK> > dump_stack_lvl+0x7f/0x90 > lockdep_rcu_suspicious+0x176/0x1c0 > kvm_vcpu_gfn_to_memslot+0x259/0x280 > kvm_vcpu_write_guest+0x3a/0xa0 > kvm_xen_write_hypercall_page+0x268/0x300 > kvm_set_msr_common+0xc44/0x1940 > vmx_set_msr+0x9db/0x1fc0 > kvm_vcpu_reset+0x857/0xb50 > kvm_arch_vcpu_create+0x37e/0x4d0 > kvm_vm_ioctl+0x669/0x2100 > __x64_sys_ioctl+0xc1/0xf0 > do_syscall_64+0xc5/0x210 > entry_SYSCALL_64_after_hwframe+0x4b/0x53 > RIP: 0033:0x7feda371b539 > > While the MSR index isn't strictly ABI, i.e. can theoretically float to > any value, in practice no known VMM sets the MSR index to anything other > than 0x40000000 or 0x40000200. > > Reported-by: syzbot+cdeaeec70992eca2d920@syzkaller.appspotmail.com > Closes: https://lore.kernel.org/all/679258d4.050a0220.2eae65.000a.GAE@google.com > Cc: Joao Martins <joao.m.martins@oracle.com> > Cc: Paul Durrant <paul@xen.org> > Cc: David Woodhouse <dwmw@amazon.co.uk> > Signed-off-by: Sean Christopherson <seanjc@google.com> With macros for the magic numbers as discussed (and a corresponding update to the documentation), and with the Reported-by: and Closes: tags dropped because they should move to the commit which makes the hypercall page only trigger for !host_initiated writes and resolves it in a more future-proof way for the general case, Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5069 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 2/5] KVM: x86/xen: Add an #ifdef'd helper to detect writes to Xen MSR 2025-02-01 1:13 [PATCH 0/5] KVM: x86/xen: Restrict hypercall MSR index Sean Christopherson 2025-02-01 1:13 ` [PATCH 1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range Sean Christopherson @ 2025-02-01 1:13 ` Sean Christopherson 2025-02-03 9:09 ` Paul Durrant 2025-02-06 16:28 ` David Woodhouse 2025-02-01 1:13 ` [PATCH 3/5] KVM: x86/xen: Consult kvm_xen_enabled when checking for Xen MSR writes Sean Christopherson ` (3 subsequent siblings) 5 siblings, 2 replies; 28+ messages in thread From: Sean Christopherson @ 2025-02-01 1:13 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini, David Woodhouse, Paul Durrant Cc: kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins, David Woodhouse Add a helper to detect writes to the Xen hypercall page MSR, and provide a stub for CONFIG_KVM_XEN=n to optimize out the check for kernels built without Xen support. Signed-off-by: Sean Christopherson <seanjc@google.com> --- arch/x86/kvm/x86.c | 2 +- arch/x86/kvm/xen.h | 10 ++++++++++ 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index b2d9a16fd4d3..f13d9d3f7c60 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3733,7 +3733,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) u32 msr = msr_info->index; u64 data = msr_info->data; - if (msr && msr == vcpu->kvm->arch.xen_hvm_config.msr) + if (kvm_xen_is_hypercall_page_msr(vcpu->kvm, msr)) return kvm_xen_write_hypercall_page(vcpu, data); switch (msr) { diff --git a/arch/x86/kvm/xen.h b/arch/x86/kvm/xen.h index f5841d9000ae..e92e06926f76 100644 --- a/arch/x86/kvm/xen.h +++ b/arch/x86/kvm/xen.h @@ -56,6 +56,11 @@ static inline bool kvm_xen_msr_enabled(struct kvm *kvm) kvm->arch.xen_hvm_config.msr; } +static inline bool kvm_xen_is_hypercall_page_msr(struct kvm *kvm, u32 msr) +{ + return msr && msr == kvm->arch.xen_hvm_config.msr; +} + static inline bool kvm_xen_hypercall_enabled(struct kvm *kvm) { return static_branch_unlikely(&kvm_xen_enabled.key) && @@ -124,6 +129,11 @@ static inline bool kvm_xen_msr_enabled(struct kvm *kvm) return false; } +static inline bool kvm_xen_is_hypercall_page_msr(struct kvm *kvm, u32 msr) +{ + return false; +} + static inline bool kvm_xen_hypercall_enabled(struct kvm *kvm) { return false; -- 2.48.1.362.g079036d154-goog ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 2/5] KVM: x86/xen: Add an #ifdef'd helper to detect writes to Xen MSR 2025-02-01 1:13 ` [PATCH 2/5] KVM: x86/xen: Add an #ifdef'd helper to detect writes to Xen MSR Sean Christopherson @ 2025-02-03 9:09 ` Paul Durrant 2025-02-06 16:28 ` David Woodhouse 1 sibling, 0 replies; 28+ messages in thread From: Paul Durrant @ 2025-02-03 9:09 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini, David Woodhouse Cc: kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins, David Woodhouse On 01/02/2025 01:13, Sean Christopherson wrote: > Add a helper to detect writes to the Xen hypercall page MSR, and provide a > stub for CONFIG_KVM_XEN=n to optimize out the check for kernels built > without Xen support. > > Signed-off-by: Sean Christopherson <seanjc@google.com> > --- > arch/x86/kvm/x86.c | 2 +- > arch/x86/kvm/xen.h | 10 ++++++++++ > 2 files changed, 11 insertions(+), 1 deletion(-) > Reviewed-by: Paul Durrant <paul@xen.org> ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/5] KVM: x86/xen: Add an #ifdef'd helper to detect writes to Xen MSR 2025-02-01 1:13 ` [PATCH 2/5] KVM: x86/xen: Add an #ifdef'd helper to detect writes to Xen MSR Sean Christopherson 2025-02-03 9:09 ` Paul Durrant @ 2025-02-06 16:28 ` David Woodhouse 1 sibling, 0 replies; 28+ messages in thread From: David Woodhouse @ 2025-02-06 16:28 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini, Paul Durrant Cc: kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins [-- Attachment #1: Type: text/plain, Size: 352 bytes --] On Fri, 2025-01-31 at 17:13 -0800, Sean Christopherson wrote: > Add a helper to detect writes to the Xen hypercall page MSR, and provide a > stub for CONFIG_KVM_XEN=n to optimize out the check for kernels built > without Xen support. > > Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5069 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 3/5] KVM: x86/xen: Consult kvm_xen_enabled when checking for Xen MSR writes 2025-02-01 1:13 [PATCH 0/5] KVM: x86/xen: Restrict hypercall MSR index Sean Christopherson 2025-02-01 1:13 ` [PATCH 1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range Sean Christopherson 2025-02-01 1:13 ` [PATCH 2/5] KVM: x86/xen: Add an #ifdef'd helper to detect writes to Xen MSR Sean Christopherson @ 2025-02-01 1:13 ` Sean Christopherson 2025-02-03 9:15 ` Paul Durrant 2025-02-06 16:29 ` David Woodhouse 2025-02-01 1:13 ` [PATCH 4/5] KVM: x86/xen: Bury xen_hvm_config behind CONFIG_KVM_XEN=y Sean Christopherson ` (2 subsequent siblings) 5 siblings, 2 replies; 28+ messages in thread From: Sean Christopherson @ 2025-02-01 1:13 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini, David Woodhouse, Paul Durrant Cc: kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins, David Woodhouse Query kvm_xen_enabled when detecting writes to the Xen hypercall page MSR so that the check is optimized away in the likely scenario that Xen isn't enabled for the VM. Deliberately open code the check instead of using kvm_xen_msr_enabled() in order to avoid a double load of xen_hvm_config.msr (which is admittedly rather pointless given the widespread lack of READ_ONCE() usage on the plethora of vCPU-scoped accesses to kvm->arch.xen state). No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> --- arch/x86/kvm/xen.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/x86/kvm/xen.h b/arch/x86/kvm/xen.h index e92e06926f76..1e3a913dfb94 100644 --- a/arch/x86/kvm/xen.h +++ b/arch/x86/kvm/xen.h @@ -58,6 +58,9 @@ static inline bool kvm_xen_msr_enabled(struct kvm *kvm) static inline bool kvm_xen_is_hypercall_page_msr(struct kvm *kvm, u32 msr) { + if (!static_branch_unlikely(&kvm_xen_enabled.key)) + return false; + return msr && msr == kvm->arch.xen_hvm_config.msr; } -- 2.48.1.362.g079036d154-goog ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 3/5] KVM: x86/xen: Consult kvm_xen_enabled when checking for Xen MSR writes 2025-02-01 1:13 ` [PATCH 3/5] KVM: x86/xen: Consult kvm_xen_enabled when checking for Xen MSR writes Sean Christopherson @ 2025-02-03 9:15 ` Paul Durrant 2025-02-06 16:29 ` David Woodhouse 1 sibling, 0 replies; 28+ messages in thread From: Paul Durrant @ 2025-02-03 9:15 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini, David Woodhouse Cc: kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins, David Woodhouse On 01/02/2025 01:13, Sean Christopherson wrote: > Query kvm_xen_enabled when detecting writes to the Xen hypercall page MSR > so that the check is optimized away in the likely scenario that Xen isn't > enabled for the VM. > > Deliberately open code the check instead of using kvm_xen_msr_enabled() in > order to avoid a double load of xen_hvm_config.msr (which is admittedly > rather pointless given the widespread lack of READ_ONCE() usage on the > plethora of vCPU-scoped accesses to kvm->arch.xen state). > > No functional change intended. > > Signed-off-by: Sean Christopherson <seanjc@google.com> > --- > arch/x86/kvm/xen.h | 3 +++ > 1 file changed, 3 insertions(+) > Reviewed-by: Paul Durrant <paul@xen.org> ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 3/5] KVM: x86/xen: Consult kvm_xen_enabled when checking for Xen MSR writes 2025-02-01 1:13 ` [PATCH 3/5] KVM: x86/xen: Consult kvm_xen_enabled when checking for Xen MSR writes Sean Christopherson 2025-02-03 9:15 ` Paul Durrant @ 2025-02-06 16:29 ` David Woodhouse 1 sibling, 0 replies; 28+ messages in thread From: David Woodhouse @ 2025-02-06 16:29 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini, Paul Durrant Cc: kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins [-- Attachment #1: Type: text/plain, Size: 686 bytes --] On Fri, 2025-01-31 at 17:13 -0800, Sean Christopherson wrote: > Query kvm_xen_enabled when detecting writes to the Xen hypercall page MSR > so that the check is optimized away in the likely scenario that Xen isn't > enabled for the VM. > > Deliberately open code the check instead of using kvm_xen_msr_enabled() in > order to avoid a double load of xen_hvm_config.msr (which is admittedly > rather pointless given the widespread lack of READ_ONCE() usage on the > plethora of vCPU-scoped accesses to kvm->arch.xen state). > > No functional change intended. > > Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5069 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 4/5] KVM: x86/xen: Bury xen_hvm_config behind CONFIG_KVM_XEN=y 2025-02-01 1:13 [PATCH 0/5] KVM: x86/xen: Restrict hypercall MSR index Sean Christopherson ` (2 preceding siblings ...) 2025-02-01 1:13 ` [PATCH 3/5] KVM: x86/xen: Consult kvm_xen_enabled when checking for Xen MSR writes Sean Christopherson @ 2025-02-01 1:13 ` Sean Christopherson 2025-02-03 9:19 ` Paul Durrant 2025-02-06 16:30 ` David Woodhouse 2025-02-01 1:14 ` [PATCH 5/5] KVM: x86/xen: Move kvm_xen_hvm_config field into kvm_xen Sean Christopherson 2025-02-06 19:14 ` [PATCH] KVM: x86/xen: Only write Xen hypercall page for guest writes to MSR David Woodhouse 5 siblings, 2 replies; 28+ messages in thread From: Sean Christopherson @ 2025-02-01 1:13 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini, David Woodhouse, Paul Durrant Cc: kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins, David Woodhouse Now that all references to kvm_vcpu_arch.xen_hvm_config are wrapped with CONFIG_KVM_XEN #ifdefs, bury the field itself behind CONFIG_KVM_XEN=y. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> --- arch/x86/include/asm/kvm_host.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 5193c3dfbce1..7f9e00004db2 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1402,8 +1402,6 @@ struct kvm_arch { struct delayed_work kvmclock_update_work; struct delayed_work kvmclock_sync_work; - struct kvm_xen_hvm_config xen_hvm_config; - /* reads protected by irq_srcu, writes by irq_lock */ struct hlist_head mask_notifier_list; @@ -1413,6 +1411,7 @@ struct kvm_arch { #ifdef CONFIG_KVM_XEN struct kvm_xen xen; + struct kvm_xen_hvm_config xen_hvm_config; #endif bool backwards_tsc_observed; -- 2.48.1.362.g079036d154-goog ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 4/5] KVM: x86/xen: Bury xen_hvm_config behind CONFIG_KVM_XEN=y 2025-02-01 1:13 ` [PATCH 4/5] KVM: x86/xen: Bury xen_hvm_config behind CONFIG_KVM_XEN=y Sean Christopherson @ 2025-02-03 9:19 ` Paul Durrant 2025-02-06 16:30 ` David Woodhouse 1 sibling, 0 replies; 28+ messages in thread From: Paul Durrant @ 2025-02-03 9:19 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini, David Woodhouse Cc: kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins, David Woodhouse On 01/02/2025 01:13, Sean Christopherson wrote: > Now that all references to kvm_vcpu_arch.xen_hvm_config are wrapped with > CONFIG_KVM_XEN #ifdefs, bury the field itself behind CONFIG_KVM_XEN=y. > > No functional change intended. > > Signed-off-by: Sean Christopherson <seanjc@google.com> > --- > arch/x86/include/asm/kvm_host.h | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > Reviewed-by: Paul Durrant <paul@xen.org> ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 4/5] KVM: x86/xen: Bury xen_hvm_config behind CONFIG_KVM_XEN=y 2025-02-01 1:13 ` [PATCH 4/5] KVM: x86/xen: Bury xen_hvm_config behind CONFIG_KVM_XEN=y Sean Christopherson 2025-02-03 9:19 ` Paul Durrant @ 2025-02-06 16:30 ` David Woodhouse 1 sibling, 0 replies; 28+ messages in thread From: David Woodhouse @ 2025-02-06 16:30 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini, Paul Durrant Cc: kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins [-- Attachment #1: Type: text/plain, Size: 367 bytes --] On Fri, 2025-01-31 at 17:13 -0800, Sean Christopherson wrote: > Now that all references to kvm_vcpu_arch.xen_hvm_config are wrapped with > CONFIG_KVM_XEN #ifdefs, bury the field itself behind CONFIG_KVM_XEN=y. > > No functional change intended. > > Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5069 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 5/5] KVM: x86/xen: Move kvm_xen_hvm_config field into kvm_xen 2025-02-01 1:13 [PATCH 0/5] KVM: x86/xen: Restrict hypercall MSR index Sean Christopherson ` (3 preceding siblings ...) 2025-02-01 1:13 ` [PATCH 4/5] KVM: x86/xen: Bury xen_hvm_config behind CONFIG_KVM_XEN=y Sean Christopherson @ 2025-02-01 1:14 ` Sean Christopherson 2025-02-03 9:21 ` Paul Durrant 2025-02-06 16:32 ` David Woodhouse 2025-02-06 19:14 ` [PATCH] KVM: x86/xen: Only write Xen hypercall page for guest writes to MSR David Woodhouse 5 siblings, 2 replies; 28+ messages in thread From: Sean Christopherson @ 2025-02-01 1:14 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini, David Woodhouse, Paul Durrant Cc: kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins, David Woodhouse Now that all KVM usage of the Xen HVM config information is buried behind CONFIG_KVM_XEN=y, move the per-VM kvm_xen_hvm_config field out of kvm_arch and into kvm_xen. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> --- arch/x86/include/asm/kvm_host.h | 3 ++- arch/x86/kvm/x86.c | 2 +- arch/x86/kvm/xen.c | 20 ++++++++++---------- arch/x86/kvm/xen.h | 6 +++--- 4 files changed, 16 insertions(+), 15 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 7f9e00004db2..e9ebd6d6492c 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1180,6 +1180,8 @@ struct kvm_xen { struct gfn_to_pfn_cache shinfo_cache; struct idr evtchn_ports; unsigned long poll_mask[BITS_TO_LONGS(KVM_MAX_VCPUS)]; + + struct kvm_xen_hvm_config hvm_config; }; #endif @@ -1411,7 +1413,6 @@ struct kvm_arch { #ifdef CONFIG_KVM_XEN struct kvm_xen xen; - struct kvm_xen_hvm_config xen_hvm_config; #endif bool backwards_tsc_observed; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index f13d9d3f7c60..b03c67d53e5f 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3188,7 +3188,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v) * problems if they observe PVCLOCK_TSC_STABLE_BIT in the pvclock flags. */ bool xen_pvclock_tsc_unstable = - ka->xen_hvm_config.flags & KVM_XEN_HVM_CONFIG_PVCLOCK_TSC_UNSTABLE; + ka->xen.hvm_config.flags & KVM_XEN_HVM_CONFIG_PVCLOCK_TSC_UNSTABLE; #endif kernel_ns = 0; diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c index 35ecafc410f0..142018b9cdd2 100644 --- a/arch/x86/kvm/xen.c +++ b/arch/x86/kvm/xen.c @@ -1280,10 +1280,10 @@ int kvm_xen_write_hypercall_page(struct kvm_vcpu *vcpu, u64 data) * Note, truncation is a non-issue as 'lm' is guaranteed to be * false for a 32-bit kernel, i.e. when hva_t is only 4 bytes. */ - hva_t blob_addr = lm ? kvm->arch.xen_hvm_config.blob_addr_64 - : kvm->arch.xen_hvm_config.blob_addr_32; - u8 blob_size = lm ? kvm->arch.xen_hvm_config.blob_size_64 - : kvm->arch.xen_hvm_config.blob_size_32; + hva_t blob_addr = lm ? kvm->arch.xen.hvm_config.blob_addr_64 + : kvm->arch.xen.hvm_config.blob_addr_32; + u8 blob_size = lm ? kvm->arch.xen.hvm_config.blob_size_64 + : kvm->arch.xen.hvm_config.blob_size_32; u8 *page; int ret; @@ -1334,13 +1334,13 @@ int kvm_xen_hvm_config(struct kvm *kvm, struct kvm_xen_hvm_config *xhc) mutex_lock(&kvm->arch.xen.xen_lock); - if (xhc->msr && !kvm->arch.xen_hvm_config.msr) + if (xhc->msr && !kvm->arch.xen.hvm_config.msr) static_branch_inc(&kvm_xen_enabled.key); - else if (!xhc->msr && kvm->arch.xen_hvm_config.msr) + else if (!xhc->msr && kvm->arch.xen.hvm_config.msr) static_branch_slow_dec_deferred(&kvm_xen_enabled); - old_flags = kvm->arch.xen_hvm_config.flags; - memcpy(&kvm->arch.xen_hvm_config, xhc, sizeof(*xhc)); + old_flags = kvm->arch.xen.hvm_config.flags; + memcpy(&kvm->arch.xen.hvm_config, xhc, sizeof(*xhc)); mutex_unlock(&kvm->arch.xen.xen_lock); @@ -1421,7 +1421,7 @@ static bool kvm_xen_schedop_poll(struct kvm_vcpu *vcpu, bool longmode, int i; if (!lapic_in_kernel(vcpu) || - !(vcpu->kvm->arch.xen_hvm_config.flags & KVM_XEN_HVM_CONFIG_EVTCHN_SEND)) + !(vcpu->kvm->arch.xen.hvm_config.flags & KVM_XEN_HVM_CONFIG_EVTCHN_SEND)) return false; if (IS_ENABLED(CONFIG_64BIT) && !longmode) { @@ -2299,6 +2299,6 @@ void kvm_xen_destroy_vm(struct kvm *kvm) } idr_destroy(&kvm->arch.xen.evtchn_ports); - if (kvm->arch.xen_hvm_config.msr) + if (kvm->arch.xen.hvm_config.msr) static_branch_slow_dec_deferred(&kvm_xen_enabled); } diff --git a/arch/x86/kvm/xen.h b/arch/x86/kvm/xen.h index 1e3a913dfb94..d191103d8163 100644 --- a/arch/x86/kvm/xen.h +++ b/arch/x86/kvm/xen.h @@ -53,7 +53,7 @@ static inline void kvm_xen_sw_enable_lapic(struct kvm_vcpu *vcpu) static inline bool kvm_xen_msr_enabled(struct kvm *kvm) { return static_branch_unlikely(&kvm_xen_enabled.key) && - kvm->arch.xen_hvm_config.msr; + kvm->arch.xen.hvm_config.msr; } static inline bool kvm_xen_is_hypercall_page_msr(struct kvm *kvm, u32 msr) @@ -61,13 +61,13 @@ static inline bool kvm_xen_is_hypercall_page_msr(struct kvm *kvm, u32 msr) if (!static_branch_unlikely(&kvm_xen_enabled.key)) return false; - return msr && msr == kvm->arch.xen_hvm_config.msr; + return msr && msr == kvm->arch.xen.hvm_config.msr; } static inline bool kvm_xen_hypercall_enabled(struct kvm *kvm) { return static_branch_unlikely(&kvm_xen_enabled.key) && - (kvm->arch.xen_hvm_config.flags & + (kvm->arch.xen.hvm_config.flags & KVM_XEN_HVM_CONFIG_INTERCEPT_HCALL); } -- 2.48.1.362.g079036d154-goog ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 5/5] KVM: x86/xen: Move kvm_xen_hvm_config field into kvm_xen 2025-02-01 1:14 ` [PATCH 5/5] KVM: x86/xen: Move kvm_xen_hvm_config field into kvm_xen Sean Christopherson @ 2025-02-03 9:21 ` Paul Durrant 2025-02-06 16:32 ` David Woodhouse 1 sibling, 0 replies; 28+ messages in thread From: Paul Durrant @ 2025-02-03 9:21 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini, David Woodhouse Cc: kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins, David Woodhouse On 01/02/2025 01:14, Sean Christopherson wrote: > Now that all KVM usage of the Xen HVM config information is buried behind > CONFIG_KVM_XEN=y, move the per-VM kvm_xen_hvm_config field out of kvm_arch > and into kvm_xen. > > No functional change intended. > > Signed-off-by: Sean Christopherson <seanjc@google.com> > --- > arch/x86/include/asm/kvm_host.h | 3 ++- > arch/x86/kvm/x86.c | 2 +- > arch/x86/kvm/xen.c | 20 ++++++++++---------- > arch/x86/kvm/xen.h | 6 +++--- > 4 files changed, 16 insertions(+), 15 deletions(-) > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > index 7f9e00004db2..e9ebd6d6492c 100644 > --- a/arch/x86/include/asm/kvm_host.h > +++ b/arch/x86/include/asm/kvm_host.h > @@ -1180,6 +1180,8 @@ struct kvm_xen { > struct gfn_to_pfn_cache shinfo_cache; > struct idr evtchn_ports; > unsigned long poll_mask[BITS_TO_LONGS(KVM_MAX_VCPUS)]; > + nit: extraneous blank line? > + struct kvm_xen_hvm_config hvm_config; > }; > #endif > Reviewed-by: Paul Durrant <paul@xen.org> ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 5/5] KVM: x86/xen: Move kvm_xen_hvm_config field into kvm_xen 2025-02-01 1:14 ` [PATCH 5/5] KVM: x86/xen: Move kvm_xen_hvm_config field into kvm_xen Sean Christopherson 2025-02-03 9:21 ` Paul Durrant @ 2025-02-06 16:32 ` David Woodhouse 1 sibling, 0 replies; 28+ messages in thread From: David Woodhouse @ 2025-02-06 16:32 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini, Paul Durrant Cc: kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins [-- Attachment #1: Type: text/plain, Size: 530 bytes --] On Fri, 2025-01-31 at 17:14 -0800, Sean Christopherson wrote: > Now that all KVM usage of the Xen HVM config information is buried behind > CONFIG_KVM_XEN=y, move the per-VM kvm_xen_hvm_config field out of kvm_arch > and into kvm_xen. > > No functional change intended. > > Signed-off-by: Sean Christopherson <seanjc@google.com> Not quite sure why this is a separate patch instead of being part of the previous one (which also applied to patches 2 and 3), but OK. Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5069 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH] KVM: x86/xen: Only write Xen hypercall page for guest writes to MSR 2025-02-01 1:13 [PATCH 0/5] KVM: x86/xen: Restrict hypercall MSR index Sean Christopherson ` (4 preceding siblings ...) 2025-02-01 1:14 ` [PATCH 5/5] KVM: x86/xen: Move kvm_xen_hvm_config field into kvm_xen Sean Christopherson @ 2025-02-06 19:14 ` David Woodhouse 2025-02-15 0:50 ` Sean Christopherson 5 siblings, 1 reply; 28+ messages in thread From: David Woodhouse @ 2025-02-06 19:14 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini, Paul Durrant Cc: kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins [-- Attachment #1: Type: text/plain, Size: 2729 bytes --] From: David Woodhouse <dwmw@amazon.co.uk> The Xen hypercall page MSR is write-only. When the guest writes an address to the MSR, the hypervisor populates the referenced page with hypercall functions. There is no reason for the host ever to write to the MSR, and it isn't even readable. Allowing host writes to trigger the hypercall page allows userspace to attack the kernel, as kvm_xen_write_hypercall_page() takes multiple locks and writes to guest memory. E.g. if userspace sets the MSR to MSR_IA32_XSS, KVM's write to MSR_IA32_XSS during vCPU creation will trigger an SRCU violation due to writing guest memory: ============================= WARNING: suspicious RCU usage 6.13.0-rc3 ----------------------------- include/linux/kvm_host.h:1046 suspicious rcu_dereference_check() usage! stack backtrace: CPU: 6 UID: 1000 PID: 1101 Comm: repro Not tainted 6.13.0-rc3 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Call Trace: <TASK> dump_stack_lvl+0x7f/0x90 lockdep_rcu_suspicious+0x176/0x1c0 kvm_vcpu_gfn_to_memslot+0x259/0x280 kvm_vcpu_write_guest+0x3a/0xa0 kvm_xen_write_hypercall_page+0x268/0x300 kvm_set_msr_common+0xc44/0x1940 vmx_set_msr+0x9db/0x1fc0 kvm_vcpu_reset+0x857/0xb50 kvm_arch_vcpu_create+0x37e/0x4d0 kvm_vm_ioctl+0x669/0x2100 __x64_sys_ioctl+0xc1/0xf0 do_syscall_64+0xc5/0x210 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7feda371b539 While the MSR index isn't strictly ABI, i.e. can theoretically float to any value, in practice no known VMM sets the MSR index to anything other than 0x40000000 or 0x40000200. Reported-by: syzbot+cdeaeec70992eca2d920@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/679258d4.050a0220.2eae65.000a.GAE@google.com Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> --- arch/x86/kvm/x86.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 6d4a6734b2d6..f1ecba788d0a 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3733,7 +3733,13 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) u32 msr = msr_info->index; u64 data = msr_info->data; - if (msr && msr == vcpu->kvm->arch.xen_hvm_config.msr) + /* + * Do not allow host-initiated writes to trigger the Xen hypercall + * page setup; it could incur locking paths which are not expected + * if userspace sets the MSR in an unusual location. + */ + if (msr && msr == vcpu->kvm->arch.xen_hvm_config.msr && + !msr_info->host_initiated) return kvm_xen_write_hypercall_page(vcpu, data); switch (msr) { -- 2.48.1 [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5069 bytes --] ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH] KVM: x86/xen: Only write Xen hypercall page for guest writes to MSR 2025-02-06 19:14 ` [PATCH] KVM: x86/xen: Only write Xen hypercall page for guest writes to MSR David Woodhouse @ 2025-02-15 0:50 ` Sean Christopherson 0 siblings, 0 replies; 28+ messages in thread From: Sean Christopherson @ 2025-02-15 0:50 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini, Paul Durrant, David Woodhouse Cc: kvm, linux-kernel, syzbot+cdeaeec70992eca2d920, Joao Martins On Thu, 06 Feb 2025 19:14:19 +0000, David Woodhouse wrote: > The Xen hypercall page MSR is write-only. When the guest writes an address > to the MSR, the hypervisor populates the referenced page with hypercall > functions. > > There is no reason for the host ever to write to the MSR, and it isn't > even readable. > > [...] Applied to kvm-x86 xen, thanks! I'll post v2 of my series on top. [1/1] KVM: x86/xen: Only write Xen hypercall page for guest writes to MSR https://github.com/kvm-x86/linux/commit/3617c0ee7dec -- https://github.com/kvm-x86/linux/tree/next ^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2025-02-15 0:56 UTC | newest] Thread overview: 28+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-02-01 1:13 [PATCH 0/5] KVM: x86/xen: Restrict hypercall MSR index Sean Christopherson 2025-02-01 1:13 ` [PATCH 1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range Sean Christopherson 2025-02-03 9:09 ` Paul Durrant 2025-02-05 9:27 ` David Woodhouse 2025-02-05 15:06 ` Sean Christopherson 2025-02-05 15:26 ` David Woodhouse 2025-02-05 15:51 ` Sean Christopherson 2025-02-05 16:18 ` David Woodhouse 2025-02-05 17:15 ` David Woodhouse 2025-02-05 19:20 ` Sean Christopherson 2025-02-06 18:58 ` David Woodhouse 2025-02-07 17:18 ` Sean Christopherson 2025-02-06 9:18 ` David Woodhouse 2025-02-06 16:51 ` David Woodhouse 2025-02-01 1:13 ` [PATCH 2/5] KVM: x86/xen: Add an #ifdef'd helper to detect writes to Xen MSR Sean Christopherson 2025-02-03 9:09 ` Paul Durrant 2025-02-06 16:28 ` David Woodhouse 2025-02-01 1:13 ` [PATCH 3/5] KVM: x86/xen: Consult kvm_xen_enabled when checking for Xen MSR writes Sean Christopherson 2025-02-03 9:15 ` Paul Durrant 2025-02-06 16:29 ` David Woodhouse 2025-02-01 1:13 ` [PATCH 4/5] KVM: x86/xen: Bury xen_hvm_config behind CONFIG_KVM_XEN=y Sean Christopherson 2025-02-03 9:19 ` Paul Durrant 2025-02-06 16:30 ` David Woodhouse 2025-02-01 1:14 ` [PATCH 5/5] KVM: x86/xen: Move kvm_xen_hvm_config field into kvm_xen Sean Christopherson 2025-02-03 9:21 ` Paul Durrant 2025-02-06 16:32 ` David Woodhouse 2025-02-06 19:14 ` [PATCH] KVM: x86/xen: Only write Xen hypercall page for guest writes to MSR David Woodhouse 2025-02-15 0:50 ` Sean Christopherson
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).