* [PATCH] KVM: x86: Use current rather than snapshotted TSC frequency if it is constant
@ 2022-04-14 18:31 Anton Romanov
2022-04-14 19:33 ` Sean Christopherson
0 siblings, 1 reply; 8+ messages in thread
From: Anton Romanov @ 2022-04-14 18:31 UTC (permalink / raw)
To: kvm, pbonzini; +Cc: seanjc, Anton Romanov
Don't snapshot tsc_khz into per-cpu cpu_tsc_khz if the
host TSC is constant, in which case the actual TSC frequency will never
change and thus capturing TSC during initialization is
unnecessary, KVM can simply use tsc_khz.
This value is snapshotted from
kvm_timer_init->kvmclock_cpu_online->tsc_khz_changed(NULL)
On CPUs with constant TSC, but not a hardware-specified TSC frequency,
snapshotting cpu_tsc_khz and using that to set a VM's target TSC
frequency can lead to VM to think its TSC frequency is not what it actually
is if refining the TSC completes after KVM snapshots tsc_khz. The
actual frequency never changes, only the kernel's calculation of what
that frequency is changes.
Ideally, KVM would not be able to race with TSC refinement, or would have
a hook into tsc_refine_calibration_work() to get an alert when refinement
is complete. Avoiding the race altogether isn't practical as refinement
takes a relative eternity; it's deliberately put on a work queue outside
of the normal boot sequence to avoid unnecessarily delaying boot.
Adding a hook is doable, but somewhat gross due to KVM's ability to be
built as a module. And if the TSC is constant, which is likely the case
for every VMX/SVM-capable CPU produced in the last decade, the race can
be hit if and only if userspace is able to create a VM before TSC
refinement completes; refinement is slow, but not that slow.
For now, punt on a proper fix, as not taking a snapshot can help some
uses cases and not taking a snapshot is arguably correct irrespective of
the race with refinement.
Signed-off-by: Anton Romanov <romanton@google.com>
---
arch/x86/kvm/x86.c | 24 ++++++++++++++++++++----
1 file changed, 20 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 547ba00ef64f..4ae9a03f549d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2907,6 +2907,19 @@ static void kvm_update_masterclock(struct kvm *kvm)
kvm_end_pvclock_update(kvm);
}
+/*
+ * If kvm is built into kernel it is possible that tsc_khz saved into
+ * per-cpu cpu_tsc_khz was yet unrefined value. If CPU provides CONSTANT_TSC it
+ * doesn't make sense to snapshot it anyway so just return tsc_khz
+ */
+static unsigned long get_cpu_tsc_khz(void)
+{
+ if (static_cpu_has(X86_FEATURE_CONSTANT_TSC))
+ return tsc_khz;
+ else
+ return __this_cpu_read(cpu_tsc_khz);
+}
+
/* Called within read_seqcount_begin/retry for kvm->pvclock_sc. */
static void __get_kvmclock(struct kvm *kvm, struct kvm_clock_data *data)
{
@@ -2917,7 +2930,7 @@ static void __get_kvmclock(struct kvm *kvm, struct kvm_clock_data *data)
get_cpu();
data->flags = 0;
- if (ka->use_master_clock && __this_cpu_read(cpu_tsc_khz)) {
+ if (ka->use_master_clock && get_cpu_tsc_khz()) {
#ifdef CONFIG_X86_64
struct timespec64 ts;
@@ -2931,7 +2944,7 @@ static void __get_kvmclock(struct kvm *kvm, struct kvm_clock_data *data)
data->flags |= KVM_CLOCK_TSC_STABLE;
hv_clock.tsc_timestamp = ka->master_cycle_now;
hv_clock.system_time = ka->master_kernel_ns + ka->kvmclock_offset;
- kvm_get_time_scale(NSEC_PER_SEC, __this_cpu_read(cpu_tsc_khz) * 1000LL,
+ kvm_get_time_scale(NSEC_PER_SEC, get_cpu_tsc_khz() * 1000LL,
&hv_clock.tsc_shift,
&hv_clock.tsc_to_system_mul);
data->clock = __pvclock_read_cycles(&hv_clock, data->host_tsc);
@@ -3049,7 +3062,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
/* Keep irq disabled to prevent changes to the clock */
local_irq_save(flags);
- tgt_tsc_khz = __this_cpu_read(cpu_tsc_khz);
+ tgt_tsc_khz = get_cpu_tsc_khz();
if (unlikely(tgt_tsc_khz == 0)) {
local_irq_restore(flags);
kvm_make_request(KVM_REQ_CLOCK_UPDATE, v);
@@ -8646,9 +8659,12 @@ static void tsc_khz_changed(void *data)
struct cpufreq_freqs *freq = data;
unsigned long khz = 0;
+ if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC))
+ return;
+
if (data)
khz = freq->new;
- else if (!boot_cpu_has(X86_FEATURE_CONSTANT_TSC))
+ else
khz = cpufreq_quick_get(raw_smp_processor_id());
if (!khz)
khz = tsc_khz;
--
2.36.0.rc0.470.gd361397f0d-goog
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] KVM: x86: Use current rather than snapshotted TSC frequency if it is constant
2022-04-14 18:31 [PATCH] KVM: x86: Use current rather than snapshotted TSC frequency if it is constant Anton Romanov
@ 2022-04-14 19:33 ` Sean Christopherson
2022-04-14 20:42 ` Anton Romanov
2022-04-19 7:46 ` Vitaly Kuznetsov
0 siblings, 2 replies; 8+ messages in thread
From: Sean Christopherson @ 2022-04-14 19:33 UTC (permalink / raw)
To: Anton Romanov; +Cc: kvm, pbonzini, Vitaly Kuznetsov
+Vitaly
On Thu, Apr 14, 2022, Anton Romanov wrote:
> Don't snapshot tsc_khz into per-cpu cpu_tsc_khz if the
> host TSC is constant, in which case the actual TSC frequency will never
> change and thus capturing TSC during initialization is
> unnecessary, KVM can simply use tsc_khz.
> This value is snapshotted from
> kvm_timer_init->kvmclock_cpu_online->tsc_khz_changed(NULL)
Nit, please wrap changelogs at ~75 chars. It's not a hard rule, e.g. if running
over or cutting early improves readability, then by all means. But wrapping
somewhat randomly makes reading the changelog unnecessarily difficult.
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 547ba00ef64f..4ae9a03f549d 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2907,6 +2907,19 @@ static void kvm_update_masterclock(struct kvm *kvm)
> kvm_end_pvclock_update(kvm);
> }
>
> +/*
> + * If kvm is built into kernel it is possible that tsc_khz saved into
> + * per-cpu cpu_tsc_khz was yet unrefined value. If CPU provides CONSTANT_TSC it
> + * doesn't make sense to snapshot it anyway so just return tsc_khz
> + */
> +static unsigned long get_cpu_tsc_khz(void)
> +{
> + if (static_cpu_has(X86_FEATURE_CONSTANT_TSC))
> + return tsc_khz;
> + else
> + return __this_cpu_read(cpu_tsc_khz);
> +}
> +
> /* Called within read_seqcount_begin/retry for kvm->pvclock_sc. */
> static void __get_kvmclock(struct kvm *kvm, struct kvm_clock_data *data)
> {
> @@ -2917,7 +2930,7 @@ static void __get_kvmclock(struct kvm *kvm, struct kvm_clock_data *data)
> get_cpu();
>
> data->flags = 0;
> - if (ka->use_master_clock && __this_cpu_read(cpu_tsc_khz)) {
> + if (ka->use_master_clock && get_cpu_tsc_khz()) {
It might make sense to open code this to make it more obvious why the "else" path
exists. That'd also eliminate a condition branch on CPUs with a constant TSC,
though I don't know if we care that much about the performance here.
if (ka->use_master_clock &&
(static_cpu_has(X86_FEATURE_CONSTANT_TSC) || __this_cpu_read(cpu_tsc_khz)))
And/or add a comment about cpu_tsc_khz being zero when the CPU is being offlined?
> #ifdef CONFIG_X86_64
> struct timespec64 ts;
>
...
> @@ -8646,9 +8659,12 @@ static void tsc_khz_changed(void *data)
> struct cpufreq_freqs *freq = data;
> unsigned long khz = 0;
>
> + if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC))
> + return;
Vitaly,
The Hyper-V guest code also sets cpu_tsc_khz, should we WARN if that notifier is
invoked and Hyper-V told us there's a constant TSC?
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ab336f7c82e4..ca8e20f5ffc0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8701,6 +8701,8 @@ static void kvm_hyperv_tsc_notifier(void)
struct kvm *kvm;
int cpu;
+ WARN_ON_ONCE(boot_cpu_has(X86_FEATURE_CONSTANT_TSC));
+
mutex_lock(&kvm_lock);
list_for_each_entry(kvm, &vm_list, vm_list)
kvm_make_mclock_inprogress_request(kvm);
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] KVM: x86: Use current rather than snapshotted TSC frequency if it is constant
2022-04-14 19:33 ` Sean Christopherson
@ 2022-04-14 20:42 ` Anton Romanov
2022-04-14 22:22 ` Sean Christopherson
2022-04-19 7:46 ` Vitaly Kuznetsov
1 sibling, 1 reply; 8+ messages in thread
From: Anton Romanov @ 2022-04-14 20:42 UTC (permalink / raw)
To: Sean Christopherson; +Cc: kvm, Paolo Bonzini, Vitaly Kuznetsov
On Thu, Apr 14, 2022 at 12:33 PM Sean Christopherson <seanjc@google.com> wrote:
> On Thu, Apr 14, 2022, Anton Romanov wrote:
> > /* Called within read_seqcount_begin/retry for kvm->pvclock_sc. */
> > static void __get_kvmclock(struct kvm *kvm, struct kvm_clock_data *data)
> > {
> > @@ -2917,7 +2930,7 @@ static void __get_kvmclock(struct kvm *kvm, struct kvm_clock_data *data)
> > get_cpu();
> >
> > data->flags = 0;
> > - if (ka->use_master_clock && __this_cpu_read(cpu_tsc_khz)) {
> > + if (ka->use_master_clock && get_cpu_tsc_khz()) {
>
> It might make sense to open code this to make it more obvious why the "else" path
> exists. That'd also eliminate a condition branch on CPUs with a constant TSC,
> though I don't know if we care that much about the performance here.
>
> if (ka->use_master_clock &&
> (static_cpu_has(X86_FEATURE_CONSTANT_TSC) || __this_cpu_read(cpu_tsc_khz)))
>
> And/or add a comment about cpu_tsc_khz being zero when the CPU is being offlined?
It looks like cpu_tsc_khz being zero is used as an indicator of CPU
being unplugged here.
I don't think your proposed change is right in this case either.
How about we still keep tsc_khz_changed untouched as well as this line?
Potentially adding here a comment that on this line we only read it to
see if CPU is not being unplugged yet
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] KVM: x86: Use current rather than snapshotted TSC frequency if it is constant
2022-04-14 20:42 ` Anton Romanov
@ 2022-04-14 22:22 ` Sean Christopherson
0 siblings, 0 replies; 8+ messages in thread
From: Sean Christopherson @ 2022-04-14 22:22 UTC (permalink / raw)
To: Anton Romanov; +Cc: kvm, Paolo Bonzini, Vitaly Kuznetsov
On Thu, Apr 14, 2022, Anton Romanov wrote:
> On Thu, Apr 14, 2022 at 12:33 PM Sean Christopherson <seanjc@google.com> wrote:
> > On Thu, Apr 14, 2022, Anton Romanov wrote:
> > > /* Called within read_seqcount_begin/retry for kvm->pvclock_sc. */
> > > static void __get_kvmclock(struct kvm *kvm, struct kvm_clock_data *data)
> > > {
> > > @@ -2917,7 +2930,7 @@ static void __get_kvmclock(struct kvm *kvm, struct kvm_clock_data *data)
> > > get_cpu();
> > >
> > > data->flags = 0;
> > > - if (ka->use_master_clock && __this_cpu_read(cpu_tsc_khz)) {
> > > + if (ka->use_master_clock && get_cpu_tsc_khz()) {
> >
> > It might make sense to open code this to make it more obvious why the "else" path
> > exists. That'd also eliminate a condition branch on CPUs with a constant TSC,
> > though I don't know if we care that much about the performance here.
> >
> > if (ka->use_master_clock &&
> > (static_cpu_has(X86_FEATURE_CONSTANT_TSC) || __this_cpu_read(cpu_tsc_khz)))
> >
> > And/or add a comment about cpu_tsc_khz being zero when the CPU is being offlined?
>
> It looks like cpu_tsc_khz being zero is used as an indicator of CPU
> being unplugged here.
That's ok, the unplug issue was that kvm_get_time_scale() got stuck in an infinite
loop due to cpu_tsc_khz being zero. Using tsc_khz is ok even though the CPU is
about to go offline, the CPU going offline doesn't make the calculation wrong.
> I don't think your proposed change is right in this case either.
> How about we still keep tsc_khz_changed untouched as well as this line?
> Potentially adding here a comment that on this line we only read it to
> see if CPU is not being unplugged yet
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] KVM: x86: Use current rather than snapshotted TSC frequency if it is constant
2022-04-14 19:33 ` Sean Christopherson
2022-04-14 20:42 ` Anton Romanov
@ 2022-04-19 7:46 ` Vitaly Kuznetsov
2022-04-19 15:39 ` Sean Christopherson
1 sibling, 1 reply; 8+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-19 7:46 UTC (permalink / raw)
To: Sean Christopherson, Anton Romanov; +Cc: kvm, pbonzini
Sean Christopherson <seanjc@google.com> writes:
> +Vitaly
>
> On Thu, Apr 14, 2022, Anton Romanov wrote:
...
>> @@ -8646,9 +8659,12 @@ static void tsc_khz_changed(void *data)
>> struct cpufreq_freqs *freq = data;
>> unsigned long khz = 0;
>>
>> + if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC))
>> + return;
>
> Vitaly,
>
> The Hyper-V guest code also sets cpu_tsc_khz, should we WARN if that notifier is
> invoked and Hyper-V told us there's a constant TSC?
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index ab336f7c82e4..ca8e20f5ffc0 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -8701,6 +8701,8 @@ static void kvm_hyperv_tsc_notifier(void)
> struct kvm *kvm;
> int cpu;
>
> + WARN_ON_ONCE(boot_cpu_has(X86_FEATURE_CONSTANT_TSC));
> +
> mutex_lock(&kvm_lock);
> list_for_each_entry(kvm, &vm_list, vm_list)
> kvm_make_mclock_inprogress_request(kvm);
>
(apologies for the delayed reply)
No, I think Hyper-V's "Reenlightenment" feature overrides (re-defines?)
X86_FEATURE_CONSTANT_TSC. E.g. I've checked a VM on E5-2667 v4
(Broadwell) CPU with no TSC scaling. This VM has 'constant_tsc' and will
certainly get reenlightenment irq on migration.
Note, Hyper-V has its own 'Invariant TSC control', see commit
dce7cd62754b5 ("x86/hyperv: Allow guests to enable InvariantTSC"). When
enabled, X86_FEATURE_TSC_RELIABLE is forced. I *think* (haven't checked
as I don't have two suitable hosts to test migration handy) this will
suppress reenlightenment so the check should be
WARN_ON_ONCE(boot_cpu_has(X86_FEATURE_TSC_RELIABLE));
instead. There is a chance that reenlightenment notifications still
arrive but the reported 'new' TSC frequency remains unchanged (silly,
but possible), I'll need to check.
--
Vitaly
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] KVM: x86: Use current rather than snapshotted TSC frequency if it is constant
2022-04-19 7:46 ` Vitaly Kuznetsov
@ 2022-04-19 15:39 ` Sean Christopherson
2022-04-19 16:07 ` Vitaly Kuznetsov
0 siblings, 1 reply; 8+ messages in thread
From: Sean Christopherson @ 2022-04-19 15:39 UTC (permalink / raw)
To: Vitaly Kuznetsov; +Cc: Anton Romanov, kvm, pbonzini
On Tue, Apr 19, 2022, Vitaly Kuznetsov wrote:
> Sean Christopherson <seanjc@google.com> writes:
>
> > +Vitaly
> >
> > On Thu, Apr 14, 2022, Anton Romanov wrote:
>
> ...
>
> >> @@ -8646,9 +8659,12 @@ static void tsc_khz_changed(void *data)
> >> struct cpufreq_freqs *freq = data;
> >> unsigned long khz = 0;
> >>
> >> + if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC))
> >> + return;
> >
> > Vitaly,
> >
> > The Hyper-V guest code also sets cpu_tsc_khz, should we WARN if that notifier is
> > invoked and Hyper-V told us there's a constant TSC?
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index ab336f7c82e4..ca8e20f5ffc0 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -8701,6 +8701,8 @@ static void kvm_hyperv_tsc_notifier(void)
> > struct kvm *kvm;
> > int cpu;
> >
> > + WARN_ON_ONCE(boot_cpu_has(X86_FEATURE_CONSTANT_TSC));
> > +
> > mutex_lock(&kvm_lock);
> > list_for_each_entry(kvm, &vm_list, vm_list)
> > kvm_make_mclock_inprogress_request(kvm);
> >
>
> (apologies for the delayed reply)
>
> No, I think Hyper-V's "Reenlightenment" feature overrides (re-defines?)
> X86_FEATURE_CONSTANT_TSC. E.g. I've checked a VM on E5-2667 v4
> (Broadwell) CPU with no TSC scaling. This VM has 'constant_tsc' and will
> certainly get reenlightenment irq on migration.
Ooh, so that a VM with a constant TSC be live migrated to another system with a
constant, but different, TSC. Does the below look correct as fixup for this patch?
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ab336f7c82e4..a944e4ba5532 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8708,10 +8708,18 @@ static void kvm_hyperv_tsc_notifier(void)
/* no guest entries from this point */
hyperv_stop_tsc_emulation();
- /* TSC frequency always matches when on Hyper-V */
- for_each_present_cpu(cpu)
- per_cpu(cpu_tsc_khz, cpu) = tsc_khz;
- kvm_max_guest_tsc_khz = tsc_khz;
+ /*
+ * TSC frequency always matches when on Hyper-V. Skip the updates if
+ * the TSC is "officially" constant, in which case KVM doesn't use the
+ * per-CPU and max variables. Note, the notifier can still fire with
+ * a constant TSC, e.g. if this VM (KVM is a Hyper-V guest) is migrated
+ * to a system with a different TSC frequency.
+ */
+ if (!boot_cpu_has(X86_FEATURE_CONSTANT_TSC)) {
+ for_each_present_cpu(cpu)
+ per_cpu(cpu_tsc_khz, cpu) = tsc_khz;
+ kvm_max_guest_tsc_khz = tsc_khz;
+ }
list_for_each_entry(kvm, &vm_list, vm_list) {
__kvm_start_pvclock_update(kvm);
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] KVM: x86: Use current rather than snapshotted TSC frequency if it is constant
2022-04-19 15:39 ` Sean Christopherson
@ 2022-04-19 16:07 ` Vitaly Kuznetsov
2022-04-19 16:13 ` Sean Christopherson
0 siblings, 1 reply; 8+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-19 16:07 UTC (permalink / raw)
To: Sean Christopherson; +Cc: Anton Romanov, kvm, pbonzini
Sean Christopherson <seanjc@google.com> writes:
> On Tue, Apr 19, 2022, Vitaly Kuznetsov wrote:
>> Sean Christopherson <seanjc@google.com> writes:
>>
>> > +Vitaly
>> >
>> > On Thu, Apr 14, 2022, Anton Romanov wrote:
>>
>> ...
>>
>> >> @@ -8646,9 +8659,12 @@ static void tsc_khz_changed(void *data)
>> >> struct cpufreq_freqs *freq = data;
>> >> unsigned long khz = 0;
>> >>
>> >> + if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC))
>> >> + return;
>> >
>> > Vitaly,
>> >
>> > The Hyper-V guest code also sets cpu_tsc_khz, should we WARN if that notifier is
>> > invoked and Hyper-V told us there's a constant TSC?
>> >
>> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> > index ab336f7c82e4..ca8e20f5ffc0 100644
>> > --- a/arch/x86/kvm/x86.c
>> > +++ b/arch/x86/kvm/x86.c
>> > @@ -8701,6 +8701,8 @@ static void kvm_hyperv_tsc_notifier(void)
>> > struct kvm *kvm;
>> > int cpu;
>> >
>> > + WARN_ON_ONCE(boot_cpu_has(X86_FEATURE_CONSTANT_TSC));
>> > +
>> > mutex_lock(&kvm_lock);
>> > list_for_each_entry(kvm, &vm_list, vm_list)
>> > kvm_make_mclock_inprogress_request(kvm);
>> >
>>
>> (apologies for the delayed reply)
>>
>> No, I think Hyper-V's "Reenlightenment" feature overrides (re-defines?)
>> X86_FEATURE_CONSTANT_TSC. E.g. I've checked a VM on E5-2667 v4
>> (Broadwell) CPU with no TSC scaling. This VM has 'constant_tsc' and will
>> certainly get reenlightenment irq on migration.
>
> Ooh, so that a VM with a constant TSC be live migrated to another system with a
> constant, but different, TSC. Does the below look correct as fixup for this patch?
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index ab336f7c82e4..a944e4ba5532 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -8708,10 +8708,18 @@ static void kvm_hyperv_tsc_notifier(void)
> /* no guest entries from this point */
> hyperv_stop_tsc_emulation();
>
> - /* TSC frequency always matches when on Hyper-V */
> - for_each_present_cpu(cpu)
> - per_cpu(cpu_tsc_khz, cpu) = tsc_khz;
> - kvm_max_guest_tsc_khz = tsc_khz;
> + /*
> + * TSC frequency always matches when on Hyper-V. Skip the updates if
> + * the TSC is "officially" constant, in which case KVM doesn't use the
> + * per-CPU and max variables. Note, the notifier can still fire with
> + * a constant TSC, e.g. if this VM (KVM is a Hyper-V guest) is migrated
> + * to a system with a different TSC frequency.
> + */
> + if (!boot_cpu_has(X86_FEATURE_CONSTANT_TSC)) {
> + for_each_present_cpu(cpu)
> + per_cpu(cpu_tsc_khz, cpu) = tsc_khz;
> + kvm_max_guest_tsc_khz = tsc_khz;
> + }
Looks good for cpu_tsc_khz but I'm not particularly sure about
kvm_max_guest_tsc_khz.
AFAIU, kvm_max_guest_tsc_khz is normally only set when TSC scaling is
available (kvm_has_tsc_control) and Hyper-V wasn't exposing it as the
field was missing in Enlightened VMCS. The most recent version
(https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/datatypes/hv_vmx_enlightened_vmcs),
however, has 'TscMultiplier' so I guess it's possible now. (Note: eVMCS
version remains '1', just a few fields were added, interesting).
Another thing is why do we set kvm_max_guest_tsc_khz to 'tsc_khz' as
normally, this is the maximum possible guest-VM TSC frequency (see
kvm_arch_hardware_setup()). With reenlightenment, the VM can be migrated
to a host with different TSC frequency, this means the maximum possible
guest-VM TSC frequency may change. What do we do with L2 VM with
'unsupported' configurations after that?
TL;DR: I *think* we can drop kvm_max_guest_tsc_khz setting from
kvm_hyperv_tsc_notifier() for now as it a) doesn't seem to be needed for
non-tscscaling case and b) correct for the tsc-scaling case. I'll need
to investigate how recent Hyper-V versions work when CPU offers
TSC-scaling feature.
>
> list_for_each_entry(kvm, &vm_list, vm_list) {
> __kvm_start_pvclock_update(kvm);
>
--
Vitaly
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] KVM: x86: Use current rather than snapshotted TSC frequency if it is constant
2022-04-19 16:07 ` Vitaly Kuznetsov
@ 2022-04-19 16:13 ` Sean Christopherson
0 siblings, 0 replies; 8+ messages in thread
From: Sean Christopherson @ 2022-04-19 16:13 UTC (permalink / raw)
To: Vitaly Kuznetsov; +Cc: Anton Romanov, kvm, pbonzini
On Tue, Apr 19, 2022, Vitaly Kuznetsov wrote:
> Sean Christopherson <seanjc@google.com> writes:
>
> > On Tue, Apr 19, 2022, Vitaly Kuznetsov wrote:
> >> Sean Christopherson <seanjc@google.com> writes:
> >> > The Hyper-V guest code also sets cpu_tsc_khz, should we WARN if that notifier is
> >> > invoked and Hyper-V told us there's a constant TSC?
...
> >> (apologies for the delayed reply)
> >>
> >> No, I think Hyper-V's "Reenlightenment" feature overrides (re-defines?)
> >> X86_FEATURE_CONSTANT_TSC. E.g. I've checked a VM on E5-2667 v4
> >> (Broadwell) CPU with no TSC scaling. This VM has 'constant_tsc' and will
> >> certainly get reenlightenment irq on migration.
> >
> > Ooh, so that a VM with a constant TSC be live migrated to another system with a
> > constant, but different, TSC. Does the below look correct as fixup for this patch?
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index ab336f7c82e4..a944e4ba5532 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -8708,10 +8708,18 @@ static void kvm_hyperv_tsc_notifier(void)
> > /* no guest entries from this point */
> > hyperv_stop_tsc_emulation();
> >
> > - /* TSC frequency always matches when on Hyper-V */
> > - for_each_present_cpu(cpu)
> > - per_cpu(cpu_tsc_khz, cpu) = tsc_khz;
> > - kvm_max_guest_tsc_khz = tsc_khz;
> > + /*
> > + * TSC frequency always matches when on Hyper-V. Skip the updates if
> > + * the TSC is "officially" constant, in which case KVM doesn't use the
> > + * per-CPU and max variables. Note, the notifier can still fire with
> > + * a constant TSC, e.g. if this VM (KVM is a Hyper-V guest) is migrated
> > + * to a system with a different TSC frequency.
> > + */
> > + if (!boot_cpu_has(X86_FEATURE_CONSTANT_TSC)) {
> > + for_each_present_cpu(cpu)
> > + per_cpu(cpu_tsc_khz, cpu) = tsc_khz;
> > + kvm_max_guest_tsc_khz = tsc_khz;
> > + }
>
> Looks good for cpu_tsc_khz but I'm not particularly sure about
> kvm_max_guest_tsc_khz.
Doh, ignore that, I got kvm_max_guest_tsc_khz confused with max_tsc_khz.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2022-04-19 16:13 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-04-14 18:31 [PATCH] KVM: x86: Use current rather than snapshotted TSC frequency if it is constant Anton Romanov
2022-04-14 19:33 ` Sean Christopherson
2022-04-14 20:42 ` Anton Romanov
2022-04-14 22:22 ` Sean Christopherson
2022-04-19 7:46 ` Vitaly Kuznetsov
2022-04-19 15:39 ` Sean Christopherson
2022-04-19 16:07 ` Vitaly Kuznetsov
2022-04-19 16:13 ` Sean Christopherson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox