* [PATCH 1/3] KVM: x86: Allow HWCR.McStatusWrEn to be cleared once set
@ 2023-09-22 16:42 Jim Mattson
2023-09-22 16:42 ` [PATCH 2/3] KVM: x86: Virtualize HWCR.TscFreqSel[bit 24] Jim Mattson
2023-09-22 16:42 ` [PATCH 3/3] KVM: selftests: Test behavior of HWCR Jim Mattson
0 siblings, 2 replies; 10+ messages in thread
From: Jim Mattson @ 2023-09-22 16:42 UTC (permalink / raw)
To: kvm, 'Sean Christopherson ', 'Paolo Bonzini '; +Cc: Jim Mattson
When HWCR is set to 0, store 0 in vcpu->arch.msr_hwcr.
Fixes: 191c8137a939 ("x86/kvm: Implement HWCR support")
Signed-off-by: Jim Mattson <jmattson@google.com>
---
arch/x86/kvm/x86.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6c9c81e82e65..3421ed7fcee0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3701,12 +3701,11 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
data &= ~(u64)0x8; /* ignore TLB cache disable */
/* Handle McStatusWrEn */
- if (data == BIT_ULL(18)) {
- vcpu->arch.msr_hwcr = data;
- } else if (data != 0) {
+ if (data & ~BIT_ULL(18)) {
kvm_pr_unimpl_wrmsr(vcpu, msr, data);
return 1;
}
+ vcpu->arch.msr_hwcr = data;
break;
case MSR_FAM10H_MMIO_CONF_BASE:
if (data != 0) {
--
2.42.0.515.g380fc7ccd1-goog
^ permalink raw reply related [flat|nested] 10+ messages in thread* [PATCH 2/3] KVM: x86: Virtualize HWCR.TscFreqSel[bit 24] 2023-09-22 16:42 [PATCH 1/3] KVM: x86: Allow HWCR.McStatusWrEn to be cleared once set Jim Mattson @ 2023-09-22 16:42 ` Jim Mattson 2023-09-22 17:21 ` Sean Christopherson 2023-09-22 16:42 ` [PATCH 3/3] KVM: selftests: Test behavior of HWCR Jim Mattson 1 sibling, 1 reply; 10+ messages in thread From: Jim Mattson @ 2023-09-22 16:42 UTC (permalink / raw) To: kvm, 'Sean Christopherson ', 'Paolo Bonzini '; +Cc: Jim Mattson On certain CPUs, Linux guests expect HWCR.TscFreqSel[bit 24] to be set. If it isn't set, they complain: [Firmware Bug]: TSC doesn't count with P0 frequency! Eliminate this complaint by setting the bit on virtual processors for which Linux guests expect it to be set. Note that this bit is read-only on said processors. Signed-off-by: Jim Mattson <jmattson@google.com> --- arch/x86/kvm/cpuid.c | 10 ++++++++++ arch/x86/kvm/x86.c | 7 +++++++ 2 files changed, 17 insertions(+) diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 0544e30b4946..2d7dcd13dcc3 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -373,6 +373,16 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu); vcpu->arch.reserved_gpa_bits = kvm_vcpu_reserved_gpa_bits_raw(vcpu); + /* + * HWCR.TscFreqSel[bit 24] has a reset value of 1 on some processors. + */ + if (guest_cpuid_is_amd_or_hygon(vcpu) && + guest_cpuid_has(vcpu, X86_FEATURE_CONSTANT_TSC) && + (guest_cpuid_family(vcpu) > 0x10 || + (guest_cpuid_family(vcpu) == 0x10 && + guest_cpuid_model(vcpu) >= 2))) + vcpu->arch.msr_hwcr |= BIT(24); + kvm_pmu_refresh(vcpu); vcpu->arch.cr4_guest_rsvd_bits = __cr4_reserved_bits(guest_cpuid_has, vcpu); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 3421ed7fcee0..cb02a7c2938b 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3699,12 +3699,19 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) data &= ~(u64)0x40; /* ignore flush filter disable */ data &= ~(u64)0x100; /* ignore ignne emulation enable */ data &= ~(u64)0x8; /* ignore TLB cache disable */ + data &= ~(u64)0x1000000;/* ignore TscFreqSel */ /* Handle McStatusWrEn */ if (data & ~BIT_ULL(18)) { kvm_pr_unimpl_wrmsr(vcpu, msr, data); return 1; } + + /* + * When set, TscFreqSel is read-only. Attempts to + * clear it are ignored. + */ + data |= vcpu->arch.msr_hwcr & BIT_ULL(24); vcpu->arch.msr_hwcr = data; break; case MSR_FAM10H_MMIO_CONF_BASE: -- 2.42.0.515.g380fc7ccd1-goog ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 2/3] KVM: x86: Virtualize HWCR.TscFreqSel[bit 24] 2023-09-22 16:42 ` [PATCH 2/3] KVM: x86: Virtualize HWCR.TscFreqSel[bit 24] Jim Mattson @ 2023-09-22 17:21 ` Sean Christopherson 2023-09-22 17:48 ` Jim Mattson 0 siblings, 1 reply; 10+ messages in thread From: Sean Christopherson @ 2023-09-22 17:21 UTC (permalink / raw) To: Jim Mattson; +Cc: kvm, 'Paolo Bonzini ' On Fri, Sep 22, 2023, Jim Mattson wrote: > On certain CPUs, Linux guests expect HWCR.TscFreqSel[bit 24] to be > set. If it isn't set, they complain: > [Firmware Bug]: TSC doesn't count with P0 frequency! > > Eliminate this complaint by setting the bit on virtual processors for > which Linux guests expect it to be set. > > Note that this bit is read-only on said processors. > > Signed-off-by: Jim Mattson <jmattson@google.com> > --- > arch/x86/kvm/cpuid.c | 10 ++++++++++ > arch/x86/kvm/x86.c | 7 +++++++ > 2 files changed, 17 insertions(+) > > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c > index 0544e30b4946..2d7dcd13dcc3 100644 > --- a/arch/x86/kvm/cpuid.c > +++ b/arch/x86/kvm/cpuid.c > @@ -373,6 +373,16 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) > vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu); > vcpu->arch.reserved_gpa_bits = kvm_vcpu_reserved_gpa_bits_raw(vcpu); > > + /* > + * HWCR.TscFreqSel[bit 24] has a reset value of 1 on some processors. > + */ > + if (guest_cpuid_is_amd_or_hygon(vcpu) && > + guest_cpuid_has(vcpu, X86_FEATURE_CONSTANT_TSC) && > + (guest_cpuid_family(vcpu) > 0x10 || > + (guest_cpuid_family(vcpu) == 0x10 && > + guest_cpuid_model(vcpu) >= 2))) > + vcpu->arch.msr_hwcr |= BIT(24); Oh hell no. It's bad enough that KVM _allows_ setting uarch specific bits, but actively setting bits is a step too far. IMO, we should delete the offending kernel code. I don't see how it provides any value these days. And *if* we want to change something in KVM so that we stop getting coustomer complaints about a useless bit, just let userspace stuff the bit. I think we should also raise the issue with AMD (Borislav maybe?) and ask/demand that bits in HWCR that KVM allows to be set are architecturally defined. It's totally fine if the value of bit 24 is uarch specific, but the behavior needs to be something that won't change from processor to processor. > kvm_pmu_refresh(vcpu); > vcpu->arch.cr4_guest_rsvd_bits = > __cr4_reserved_bits(guest_cpuid_has, vcpu); > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 3421ed7fcee0..cb02a7c2938b 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -3699,12 +3699,19 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) > data &= ~(u64)0x40; /* ignore flush filter disable */ > data &= ~(u64)0x100; /* ignore ignne emulation enable */ > data &= ~(u64)0x8; /* ignore TLB cache disable */ > + data &= ~(u64)0x1000000;/* ignore TscFreqSel */ > > /* Handle McStatusWrEn */ > if (data & ~BIT_ULL(18)) { > kvm_pr_unimpl_wrmsr(vcpu, msr, data); > return 1; > } > + > + /* > + * When set, TscFreqSel is read-only. Attempts to > + * clear it are ignored. > + */ > + data |= vcpu->arch.msr_hwcr & BIT_ULL(24); The bit is read-only from the guest, but KVM needs to let userspace clear the bit. > vcpu->arch.msr_hwcr = data; > break; > case MSR_FAM10H_MMIO_CONF_BASE: > -- > 2.42.0.515.g380fc7ccd1-goog > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/3] KVM: x86: Virtualize HWCR.TscFreqSel[bit 24] 2023-09-22 17:21 ` Sean Christopherson @ 2023-09-22 17:48 ` Jim Mattson 2023-09-22 18:15 ` Sean Christopherson 0 siblings, 1 reply; 10+ messages in thread From: Jim Mattson @ 2023-09-22 17:48 UTC (permalink / raw) To: Sean Christopherson; +Cc: kvm, Paolo Bonzini On Fri, Sep 22, 2023 at 10:21 AM Sean Christopherson <seanjc@google.com> wrote: > > On Fri, Sep 22, 2023, Jim Mattson wrote: > > On certain CPUs, Linux guests expect HWCR.TscFreqSel[bit 24] to be > > set. If it isn't set, they complain: > > [Firmware Bug]: TSC doesn't count with P0 frequency! > > > > Eliminate this complaint by setting the bit on virtual processors for > > which Linux guests expect it to be set. > > > > Note that this bit is read-only on said processors. > > > > Signed-off-by: Jim Mattson <jmattson@google.com> > > --- > > arch/x86/kvm/cpuid.c | 10 ++++++++++ > > arch/x86/kvm/x86.c | 7 +++++++ > > 2 files changed, 17 insertions(+) > > > > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c > > index 0544e30b4946..2d7dcd13dcc3 100644 > > --- a/arch/x86/kvm/cpuid.c > > +++ b/arch/x86/kvm/cpuid.c > > @@ -373,6 +373,16 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) > > vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu); > > vcpu->arch.reserved_gpa_bits = kvm_vcpu_reserved_gpa_bits_raw(vcpu); > > > > + /* > > + * HWCR.TscFreqSel[bit 24] has a reset value of 1 on some processors. > > + */ > > + if (guest_cpuid_is_amd_or_hygon(vcpu) && > > + guest_cpuid_has(vcpu, X86_FEATURE_CONSTANT_TSC) && > > + (guest_cpuid_family(vcpu) > 0x10 || > > + (guest_cpuid_family(vcpu) == 0x10 && > > + guest_cpuid_model(vcpu) >= 2))) > > + vcpu->arch.msr_hwcr |= BIT(24); > > Oh hell no. It's bad enough that KVM _allows_ setting uarch specific bits, but > actively setting bits is a step too far. The bit should be set on power on. From the PPR for AMD Family 17h Model 01h, Revision B1 Processors, > TscFreqSel: TSC frequency select. Read-only. Reset: 1. Leaving it clear is a violation of the CPU vendor's hardware specification. > IMO, we should delete the offending kernel code. I don't see how it provides any > value these days. Sure, but that doesn't help legacy guests. > And *if* we want to change something in KVM so that we stop getting coustomer > complaints about a useless bit, just let userspace stuff the bit. We want to make customers happy. That should not even be a question. > I think we should also raise the issue with AMD (Borislav maybe?) and ask/demand > that bits in HWCR that KVM allows to be set are architecturally defined. It's > totally fine if the value of bit 24 is uarch specific, but the behavior needs to > be something that won't change from processor to processor. > > > kvm_pmu_refresh(vcpu); > > vcpu->arch.cr4_guest_rsvd_bits = > > __cr4_reserved_bits(guest_cpuid_has, vcpu); > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > index 3421ed7fcee0..cb02a7c2938b 100644 > > --- a/arch/x86/kvm/x86.c > > +++ b/arch/x86/kvm/x86.c > > @@ -3699,12 +3699,19 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) > > data &= ~(u64)0x40; /* ignore flush filter disable */ > > data &= ~(u64)0x100; /* ignore ignne emulation enable */ > > data &= ~(u64)0x8; /* ignore TLB cache disable */ > > + data &= ~(u64)0x1000000;/* ignore TscFreqSel */ > > > > /* Handle McStatusWrEn */ > > if (data & ~BIT_ULL(18)) { > > kvm_pr_unimpl_wrmsr(vcpu, msr, data); > > return 1; > > } > > + > > + /* > > + * When set, TscFreqSel is read-only. Attempts to > > + * clear it are ignored. > > + */ > > + data |= vcpu->arch.msr_hwcr & BIT_ULL(24); > > > The bit is read-only from the guest, but KVM needs to let userspace clear the > bit. Why? We don't let userspace clear bit 1 of EFLAGS, which is also a "reads as one" bit. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/3] KVM: x86: Virtualize HWCR.TscFreqSel[bit 24] 2023-09-22 17:48 ` Jim Mattson @ 2023-09-22 18:15 ` Sean Christopherson 2023-09-22 18:27 ` Jim Mattson 0 siblings, 1 reply; 10+ messages in thread From: Sean Christopherson @ 2023-09-22 18:15 UTC (permalink / raw) To: Jim Mattson; +Cc: kvm, Paolo Bonzini On Fri, Sep 22, 2023, Jim Mattson wrote: > On Fri, Sep 22, 2023 at 10:21 AM Sean Christopherson <seanjc@google.com> wrote: > > IMO, we should delete the offending kernel code. I don't see how it provides any > > value these days. > > Sure, but that doesn't help legacy guests. Heh, IMO they don't need help, their owners just need to be placated ;-) > > And *if* we want to change something in KVM so that we stop getting coustomer > > complaints about a useless bit, just let userspace stuff the bit. > > We want to make customers happy. That should not even be a question. Can we really not tell them "this is a benign guest bug, ignore it"? > > I think we should also raise the issue with AMD (Borislav maybe?) and ask/demand > > that bits in HWCR that KVM allows to be set are architecturally defined. It's > > totally fine if the value of bit 24 is uarch specific, but the behavior needs to > > be something that won't change from processor to processor. > > > > > kvm_pmu_refresh(vcpu); > > > vcpu->arch.cr4_guest_rsvd_bits = > > > __cr4_reserved_bits(guest_cpuid_has, vcpu); > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > > index 3421ed7fcee0..cb02a7c2938b 100644 > > > --- a/arch/x86/kvm/x86.c > > > +++ b/arch/x86/kvm/x86.c > > > @@ -3699,12 +3699,19 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) > > > data &= ~(u64)0x40; /* ignore flush filter disable */ > > > data &= ~(u64)0x100; /* ignore ignne emulation enable */ > > > data &= ~(u64)0x8; /* ignore TLB cache disable */ > > > + data &= ~(u64)0x1000000;/* ignore TscFreqSel */ > > > > > > /* Handle McStatusWrEn */ > > > if (data & ~BIT_ULL(18)) { > > > kvm_pr_unimpl_wrmsr(vcpu, msr, data); > > > return 1; > > > } > > > + > > > + /* > > > + * When set, TscFreqSel is read-only. Attempts to > > > + * clear it are ignored. > > > + */ > > > + data |= vcpu->arch.msr_hwcr & BIT_ULL(24); > > > > > > The bit is read-only from the guest, but KVM needs to let userspace clear the > > bit. > > Why? We don't let userspace clear bit 1 of EFLAGS, which is also a > "reads as one" bit. Because that's architectural behavior, not dependent on FMS, and KVM needs to write EFLAGS to have any hope of being useful, i.e. giving ownership of EFLAGS to userspace is not a realistic option. As proposed, if userspace sets CPUID to a magic FMS, and then changes the FMS to something else, kvm_vcpu_after_set_cpuid() will not clear the bit and KVM will end up wrongly enumerating the bit. I doubt userspace would ever do that, but it's at least possible. That could be fixed by actively clearing vcpu->arch.msr_hwcr for other FMS values, but then KVM would have to be 100% precise on the FMS matching, which would be a maintenance nightmare. In other words, userspace owns the vCPU model, and for good reasons. KVM needs to allow userspace to define a sane model, but with *very* few exceptions, KVM should not try to "help" userspace by stuffing bits. Pretty much everytime KVM tries to help, it causes problems. E.g. initializing perf_capabilities to kvm_caps.supported_perf_cap seems like a good thing, except it presents a bogus model if userspace decides to not enumerate a vPMU to the guest (Aaron was allegedly going to send a patch for this...). ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/3] KVM: x86: Virtualize HWCR.TscFreqSel[bit 24] 2023-09-22 18:15 ` Sean Christopherson @ 2023-09-22 18:27 ` Jim Mattson 2023-09-22 19:40 ` Sean Christopherson 0 siblings, 1 reply; 10+ messages in thread From: Jim Mattson @ 2023-09-22 18:27 UTC (permalink / raw) To: Sean Christopherson; +Cc: kvm, Paolo Bonzini On Fri, Sep 22, 2023 at 11:15 AM Sean Christopherson <seanjc@google.com> wrote: > > On Fri, Sep 22, 2023, Jim Mattson wrote: > > On Fri, Sep 22, 2023 at 10:21 AM Sean Christopherson <seanjc@google.com> wrote: > > > IMO, we should delete the offending kernel code. I don't see how it provides any > > > value these days. > > > > Sure, but that doesn't help legacy guests. > > Heh, IMO they don't need help, their owners just need to be placated ;-) > > > > And *if* we want to change something in KVM so that we stop getting coustomer > > > complaints about a useless bit, just let userspace stuff the bit. > > > > We want to make customers happy. That should not even be a question. > > Can we really not tell them "this is a benign guest bug, ignore it"? What is the mechanism for doing that? > > > I think we should also raise the issue with AMD (Borislav maybe?) and ask/demand > > > that bits in HWCR that KVM allows to be set are architecturally defined. It's > > > totally fine if the value of bit 24 is uarch specific, but the behavior needs to > > > be something that won't change from processor to processor. > > > > > > > kvm_pmu_refresh(vcpu); > > > > vcpu->arch.cr4_guest_rsvd_bits = > > > > __cr4_reserved_bits(guest_cpuid_has, vcpu); > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > > > index 3421ed7fcee0..cb02a7c2938b 100644 > > > > --- a/arch/x86/kvm/x86.c > > > > +++ b/arch/x86/kvm/x86.c > > > > @@ -3699,12 +3699,19 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) > > > > data &= ~(u64)0x40; /* ignore flush filter disable */ > > > > data &= ~(u64)0x100; /* ignore ignne emulation enable */ > > > > data &= ~(u64)0x8; /* ignore TLB cache disable */ > > > > + data &= ~(u64)0x1000000;/* ignore TscFreqSel */ > > > > > > > > /* Handle McStatusWrEn */ > > > > if (data & ~BIT_ULL(18)) { > > > > kvm_pr_unimpl_wrmsr(vcpu, msr, data); > > > > return 1; > > > > } > > > > + > > > > + /* > > > > + * When set, TscFreqSel is read-only. Attempts to > > > > + * clear it are ignored. > > > > + */ > > > > + data |= vcpu->arch.msr_hwcr & BIT_ULL(24); > > > > > > > > > The bit is read-only from the guest, but KVM needs to let userspace clear the > > > bit. > > > > Why? We don't let userspace clear bit 1 of EFLAGS, which is also a > > "reads as one" bit. > > Because that's architectural behavior, not dependent on FMS, and KVM needs to > write EFLAGS to have any hope of being useful, i.e. giving ownership of EFLAGS > to userspace is not a realistic option. Remind me what "MSR" stands for. :) > As proposed, if userspace sets CPUID to a magic FMS, and then changes the FMS to > something else, kvm_vcpu_after_set_cpuid() will not clear the bit and KVM will > end up wrongly enumerating the bit. I doubt userspace would ever do that, but > it's at least possible. > > That could be fixed by actively clearing vcpu->arch.msr_hwcr for other FMS values, > but then KVM would have to be 100% precise on the FMS matching, which would be a > maintenance nightmare. What if I did something crude like we do for MSR_IA32_MISC_ENABLE, and just set the bit at reset regardless of FMS: diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index cb02a7c2938b..4d7d0de42a9d 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -12086,6 +12086,7 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) vcpu->arch.msr_misc_features_enables = 0; vcpu->arch.ia32_misc_enable_msr = MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL | MSR_IA32_MISC_ENABLE_BTS_UNAVAIL; + vcpu_arch.msr_hwcr = BIT_ULL(24); __kvm_set_xcr(vcpu, 0, XFEATURE_MASK_FP); __kvm_set_msr(vcpu, MSR_IA32_XSS, 0, true); > In other words, userspace owns the vCPU model, and for good reasons. KVM needs > to allow userspace to define a sane model, but with *very* few exceptions, KVM > should not try to "help" userspace by stuffing bits. Okay. What about the IA32_MISC_ENABLE bits above? > Pretty much everytime KVM tries to help, it causes problems. E.g. initializing > perf_capabilities to kvm_caps.supported_perf_cap seems like a good thing, except > it presents a bogus model if userspace decides to not enumerate a vPMU to the > guest (Aaron was allegedly going to send a patch for this...). KVM is nothing if not inconsistent. ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 2/3] KVM: x86: Virtualize HWCR.TscFreqSel[bit 24] 2023-09-22 18:27 ` Jim Mattson @ 2023-09-22 19:40 ` Sean Christopherson 2023-09-22 20:16 ` Jim Mattson 0 siblings, 1 reply; 10+ messages in thread From: Sean Christopherson @ 2023-09-22 19:40 UTC (permalink / raw) To: Jim Mattson; +Cc: kvm, Paolo Bonzini On Fri, Sep 22, 2023, Jim Mattson wrote: > On Fri, Sep 22, 2023 at 11:15 AM Sean Christopherson <seanjc@google.com> wrote: > > > > On Fri, Sep 22, 2023, Jim Mattson wrote: > > > On Fri, Sep 22, 2023 at 10:21 AM Sean Christopherson <seanjc@google.com> wrote: > > > > IMO, we should delete the offending kernel code. I don't see how it provides any > > > > value these days. > > > > > > Sure, but that doesn't help legacy guests. > > > > Heh, IMO they don't need help, their owners just need to be placated ;-) > > > > > > And *if* we want to change something in KVM so that we stop getting coustomer > > > > complaints about a useless bit, just let userspace stuff the bit. > > > > > > We want to make customers happy. That should not even be a question. > > > > Can we really not tell them "this is a benign guest bug, ignore it"? > > What is the mechanism for doing that? Not my problem? :-D > > Because that's architectural behavior, not dependent on FMS, and KVM needs to > > write EFLAGS to have any hope of being useful, i.e. giving ownership of EFLAGS > > to userspace is not a realistic option. > > Remind me what "MSR" stands for. :) Heh, that acronym has long, long since lost all meaning. Joking aside, I think KVM needs to set a very, very high bare for emulating any part of any MSR that is truly model specific. IMO, it's far too likely that KVM will be the one left holding the bag in such situations. > > As proposed, if userspace sets CPUID to a magic FMS, and then changes the FMS to > > something else, kvm_vcpu_after_set_cpuid() will not clear the bit and KVM will > > end up wrongly enumerating the bit. I doubt userspace would ever do that, but > > it's at least possible. > > > > That could be fixed by actively clearing vcpu->arch.msr_hwcr for other FMS values, > > but then KVM would have to be 100% precise on the FMS matching, which would be a > > maintenance nightmare. > > What if I did something crude like we do for MSR_IA32_MISC_ENABLE, and > just set the bit at reset regardless of FMS: I'd prefer that over playing games with FMS. Though my first chioce would still be to punt the decision to userspace. > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index cb02a7c2938b..4d7d0de42a9d 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -12086,6 +12086,7 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, > bool init_event) > vcpu->arch.msr_misc_features_enables = 0; > vcpu->arch.ia32_misc_enable_msr = > MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL | > > MSR_IA32_MISC_ENABLE_BTS_UNAVAIL; > + vcpu_arch.msr_hwcr = BIT_ULL(24); > > __kvm_set_xcr(vcpu, 0, XFEATURE_MASK_FP); > __kvm_set_msr(vcpu, MSR_IA32_XSS, 0, true); > > > In other words, userspace owns the vCPU model, and for good reasons. KVM needs > > to allow userspace to define a sane model, but with *very* few exceptions, KVM > > should not try to "help" userspace by stuffing bits. > > Okay. What about the IA32_MISC_ENABLE bits above? One of the exceptions where I don't see a better option, and hopefully something that Intel won't repeat in the future. Though I'm not exactly brimming with confidence that Intel won't retroactively add more "gotcha! unsupported!" bits in the future when they realize they forgot add a useful CPUID feature bit. > > Pretty much everytime KVM tries to help, it causes problems. E.g. initializing > > perf_capabilities to kvm_caps.supported_perf_cap seems like a good thing, except > > it presents a bogus model if userspace decides to not enumerate a vPMU to the > > guest (Aaron was allegedly going to send a patch for this...). > > KVM is nothing if not inconsistent. Yeah, the existing inconsistencies are painful, but that's not a good reason to continue what I see as bad behavior. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/3] KVM: x86: Virtualize HWCR.TscFreqSel[bit 24] 2023-09-22 19:40 ` Sean Christopherson @ 2023-09-22 20:16 ` Jim Mattson 2023-09-22 20:51 ` Sean Christopherson 0 siblings, 1 reply; 10+ messages in thread From: Jim Mattson @ 2023-09-22 20:16 UTC (permalink / raw) To: Sean Christopherson; +Cc: kvm, Paolo Bonzini On Fri, Sep 22, 2023 at 12:40 PM Sean Christopherson <seanjc@google.com> wrote: > > On Fri, Sep 22, 2023, Jim Mattson wrote: > > Okay. What about the IA32_MISC_ENABLE bits above? > > One of the exceptions where I don't see a better option, and hopefully something > that Intel won't repeat in the future. Though I'm not exactly brimming with > confidence that Intel won't retroactively add more "gotcha! unsupported!" bits > in the future when they realize they forgot add a useful CPUID feature bit. I don't understand the difference here. Why not make userspace responsible for setting these bits as well? ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/3] KVM: x86: Virtualize HWCR.TscFreqSel[bit 24] 2023-09-22 20:16 ` Jim Mattson @ 2023-09-22 20:51 ` Sean Christopherson 0 siblings, 0 replies; 10+ messages in thread From: Sean Christopherson @ 2023-09-22 20:51 UTC (permalink / raw) To: Jim Mattson; +Cc: kvm, Paolo Bonzini On Fri, Sep 22, 2023, Jim Mattson wrote: > On Fri, Sep 22, 2023 at 12:40 PM Sean Christopherson <seanjc@google.com> wrote: > > > > On Fri, Sep 22, 2023, Jim Mattson wrote: > > > Okay. What about the IA32_MISC_ENABLE bits above? > > > > One of the exceptions where I don't see a better option, and hopefully something > > that Intel won't repeat in the future. Though I'm not exactly brimming with > > confidence that Intel won't retroactively add more "gotcha! unsupported!" bits > > in the future when they realize they forgot add a useful CPUID feature bit. > > I don't understand the difference here. Why not make userspace > responsible for setting these bits as well? That probably would have been the ideal approach. I'm not entirely sure it would have actually been feasible though, as I suspect enumerting X86_FEATURE_DS without any kind of guard would break userspace that reflects KVM_GET_SUPPORTED_CPUID back into KVM_SET_CPUID(2). Even better would have been to never merge PEBS support in KVM in its current form. The whole thing is a house of cards, e.g. if counters are "cross-mapped" then the guest counters simply stop working. And those warts aside, the entire enabling was a chaotic mess. See commit 9fc222967a39 ("KVM: x86: Give host userspace full control of MSR_IA32_MISC_ENABLES"). In other words, setting the UNAVAILABLE bits was the least awful way to salvage the mess. ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 3/3] KVM: selftests: Test behavior of HWCR 2023-09-22 16:42 [PATCH 1/3] KVM: x86: Allow HWCR.McStatusWrEn to be cleared once set Jim Mattson 2023-09-22 16:42 ` [PATCH 2/3] KVM: x86: Virtualize HWCR.TscFreqSel[bit 24] Jim Mattson @ 2023-09-22 16:42 ` Jim Mattson 1 sibling, 0 replies; 10+ messages in thread From: Jim Mattson @ 2023-09-22 16:42 UTC (permalink / raw) To: kvm, 'Sean Christopherson ', 'Paolo Bonzini '; +Cc: Jim Mattson Verify the following: * Any bits that read as one cannot be cleared * Attempts to set bits 3, 6, 8, or 24 are ignored * Bit 18 is the only bit that can be set * Any bit that can be set can also be cleared Signed-off-by: Jim Mattson <jmattson@google.com> --- tools/testing/selftests/kvm/Makefile | 1 + .../selftests/kvm/x86_64/hwcr_msr_test.c | 57 +++++++++++++++++++ 2 files changed, 58 insertions(+) create mode 100644 tools/testing/selftests/kvm/x86_64/hwcr_msr_test.c diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index a3bb36fb3cfc..6b0219ca65eb 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -135,6 +135,7 @@ TEST_GEN_PROGS_x86_64 += set_memory_region_test TEST_GEN_PROGS_x86_64 += steal_time TEST_GEN_PROGS_x86_64 += kvm_binary_stats_test TEST_GEN_PROGS_x86_64 += system_counter_offset_test +TEST_GEN_PROGS_x86_64 += x86_64/hwcr_msr_test # Compiled outputs used by test targets TEST_GEN_PROGS_EXTENDED_x86_64 += x86_64/nx_huge_pages_test diff --git a/tools/testing/selftests/kvm/x86_64/hwcr_msr_test.c b/tools/testing/selftests/kvm/x86_64/hwcr_msr_test.c new file mode 100644 index 000000000000..123267b44daf --- /dev/null +++ b/tools/testing/selftests/kvm/x86_64/hwcr_msr_test.c @@ -0,0 +1,57 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2023, Google LLC. + * + * Tests for the K7_HWCR MSR. + */ + +#define _GNU_SOURCE /* for program_invocation_short_name */ +#include <sys/ioctl.h> + +#include "test_util.h" +#include "kvm_util.h" +#include "vmx.h" + +void test_hwcr_bit(struct kvm_vcpu *vcpu, unsigned int bit) +{ + const unsigned long long ignored = + BIT_ULL(3) | BIT_ULL(6) | BIT_ULL(8) | BIT_ULL(24); + const unsigned long long legal = ignored | BIT_ULL(18); + uint64_t orig = vcpu_get_msr(vcpu, MSR_K7_HWCR); + uint64_t val = BIT_ULL(bit); + uint64_t check; + int r; + + r = _vcpu_set_msr(vcpu, MSR_K7_HWCR, val); + TEST_ASSERT((r == 1 && (val & legal)) || (r == 0 && !(val & legal)), + "Unexpected result (%d) when setting HWCR[bit %u]", r, bit); + check = vcpu_get_msr(vcpu, MSR_K7_HWCR); + if (val & (legal & ~ignored)) { + TEST_ASSERT(check == (orig | val), + "Bit %u: unexpected HWCR %lx; expected %lx", bit, + check, orig | val); + _vcpu_set_msr(vcpu, MSR_K7_HWCR, 0); + check = vcpu_get_msr(vcpu, MSR_K7_HWCR); + TEST_ASSERT(check == orig, + "Bit %u: unexpected HWCR %lx; expected %lx", bit, + check, orig); + } else { + TEST_ASSERT(check == orig, + "Bit %u: unexpected HWCR %lx; expected %lx", bit, + check, orig); + } +} + +int main(int argc, char *argv[]) +{ + struct kvm_vm *vm; + struct kvm_vcpu *vcpu; + unsigned int bit; + + vm = vm_create_with_one_vcpu(&vcpu, NULL); + + for (bit = 0; bit < BITS_PER_LONG; bit++) + test_hwcr_bit(vcpu, bit); + + kvm_vm_free(vm); +} -- 2.42.0.515.g380fc7ccd1-goog ^ permalink raw reply related [flat|nested] 10+ messages in thread
end of thread, other threads:[~2023-09-22 20:52 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-09-22 16:42 [PATCH 1/3] KVM: x86: Allow HWCR.McStatusWrEn to be cleared once set Jim Mattson 2023-09-22 16:42 ` [PATCH 2/3] KVM: x86: Virtualize HWCR.TscFreqSel[bit 24] Jim Mattson 2023-09-22 17:21 ` Sean Christopherson 2023-09-22 17:48 ` Jim Mattson 2023-09-22 18:15 ` Sean Christopherson 2023-09-22 18:27 ` Jim Mattson 2023-09-22 19:40 ` Sean Christopherson 2023-09-22 20:16 ` Jim Mattson 2023-09-22 20:51 ` Sean Christopherson 2023-09-22 16:42 ` [PATCH 3/3] KVM: selftests: Test behavior of HWCR Jim Mattson
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox