public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Like Xu <like.xu.linux@gmail.com>
To: Sean Christopherson <seanjc@google.com>
Cc: David Woodhouse <dwmw2@infradead.org>,
	Oliver Upton <oliver.upton@linux.dev>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [PATCH v6] KVM: x86/tsc: Don't sync user-written TSC against startup values
Date: Thu, 14 Sep 2023 11:50:02 +0800	[thread overview]
Message-ID: <3912b026-7cd2-9981-27eb-e8e37be9bbad@gmail.com> (raw)
In-Reply-To: <ZQHLcs3VGyLUb6wW@google.com>

On 13/9/2023 10:47 pm, Sean Christopherson wrote:
> On Wed, Sep 13, 2023, Like Xu wrote:
>> I'll wait for a cooling off period to see if the maintainers need me to post v7.
> 
> You should have waiting to post v5, let alone v6.  Resurrecting a thread after a
> month and not waiting even 7 hours for others to respond is extremely frustrating.

You are right. I don't seem to be keeping up with many of other issues. Sorry 
for that.
Wish there were 48 hours in a day.

Back to this issue: for commit message, I'd be more inclined to David's 
understanding,
but you have the gavel; and for proposed code diff, how about the following changes:

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 1a4def36d5bb..9a7dfef9d32d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1324,6 +1324,7 @@ struct kvm_arch {
  	int nr_vcpus_matched_tsc;

  	u32 default_tsc_khz;
+	bool user_set_tsc;

  	seqcount_raw_spinlock_t pvclock_sc;
  	bool use_master_clock;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6c9c81e82e65..faaae8b3fec4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2714,8 +2714,9 @@ static void __kvm_synchronize_tsc(struct kvm_vcpu *vcpu, 
u64 offset, u64 tsc,
  	kvm_track_tsc_matching(vcpu);
  }

-static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 data)
+static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 *user_value)
  {
+	u64 data = user_value ? *user_value : 0;
  	struct kvm *kvm = vcpu->kvm;
  	u64 offset, ns, elapsed;
  	unsigned long flags;
@@ -2728,27 +2729,45 @@ static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, 
u64 data)
  	elapsed = ns - kvm->arch.last_tsc_nsec;

  	if (vcpu->arch.virtual_tsc_khz) {
+		/*
+		 * Force synchronization when creating or hotplugging a vCPU,
+		 * i.e. when the TSC value is '0', to help keep clocks stable.
+		 * If this is NOT a hotplug/creation case, skip synchronization
+		 * on the first write from userspace so as not to misconstrue
+		 * state restoration after live migration as an attempt from
+		 * userspace to synchronize.
+		 */
  		if (data == 0) {
-			/*
-			 * detection of vcpu initialization -- need to sync
-			 * with other vCPUs. This particularly helps to keep
-			 * kvm_clock stable after CPU hotplug
-			 */
  			synchronizing = true;
-		} else {
+		} else if (kvm->arch.user_set_tsc) {
  			u64 tsc_exp = kvm->arch.last_tsc_write +
  						nsec_to_cycles(vcpu, elapsed);
  			u64 tsc_hz = vcpu->arch.virtual_tsc_khz * 1000LL;
  			/*
-			 * Special case: TSC write with a small delta (1 second)
-			 * of virtual cycle time against real time is
-			 * interpreted as an attempt to synchronize the CPU.
+			 * Here lies UAPI baggage: when a user-initiated TSC write has
+			 * a small delta (1 second) of virtual cycle time against the
+			 * previously set vCPU, we assume that they were intended to be
+			 * in sync and the delta was only due to the racy nature of the
+			 * legacy API.
+			 *
+			 * This trick falls down when restoring a guest which genuinely
+			 * has been running for less time than the 1 second of imprecision
+			 * which we allow for in the legacy API. In this case, the first
+			 * value written by userspace (on any vCPU) should not be subject
+			 * to this 'correction' to make it sync up with values that only
+			 * from the kernel's default vCPU creation. Make the 1-second slop
+			 * hack only trigger if the user_set_tsc flag is already set.
+			 *
+			 * The correct answer is for the VMM not to use the legacy API.
  			 */
  			synchronizing = data < tsc_exp + tsc_hz &&
  					data + tsc_hz > tsc_exp;
  		}
  	}

+	if (user_value)
+		kvm->arch.user_set_tsc = true;
+
  	/*
  	 * For a reliable TSC, we can match TSC offsets, and for an unstable
  	 * TSC, we add elapsed time in this computation.  We could let the
@@ -3777,7 +3796,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct 
msr_data *msr_info)
  		break;
  	case MSR_IA32_TSC:
  		if (msr_info->host_initiated) {
-			kvm_synchronize_tsc(vcpu, data);
+			kvm_synchronize_tsc(vcpu, &data);
  		} else {
  			u64 adj = kvm_compute_l1_tsc_offset(vcpu, data) - vcpu->arch.l1_tsc_offset;
  			adjust_tsc_offset_guest(vcpu, adj);
@@ -5536,6 +5555,7 @@ static int kvm_arch_tsc_set_attr(struct kvm_vcpu *vcpu,
  		tsc = kvm_scale_tsc(rdtsc(), vcpu->arch.l1_tsc_scaling_ratio) + offset;
  		ns = get_kvmclock_base_ns();

+		kvm->arch.user_set_tsc = true;
  		__kvm_synchronize_tsc(vcpu, offset, tsc, ns, matched);
  		raw_spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags);

@@ -11959,7 +11979,7 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
  	if (mutex_lock_killable(&vcpu->mutex))
  		return;
  	vcpu_load(vcpu);
-	kvm_synchronize_tsc(vcpu, 0);
+	kvm_synchronize_tsc(vcpu, NULL);
  	vcpu_put(vcpu);

  	/* poll control enabled by default */
-- 
2.42.0

  reply	other threads:[~2023-09-14  3:50 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-13 10:37 [PATCH v6] KVM: x86/tsc: Don't sync user-written TSC against startup values Like Xu
2023-09-13 10:51 ` David Woodhouse
2023-09-13 10:59   ` Like Xu
2023-09-13 14:47     ` Sean Christopherson
2023-09-14  3:50       ` Like Xu [this message]
2023-09-14  7:31         ` David Woodhouse
2023-09-19 11:29           ` Like Xu
2023-09-25  7:36             ` Like Xu
2023-09-25  8:31               ` David Woodhouse
2023-10-07  7:29                 ` Like Xu
2023-10-07 12:51                   ` David Woodhouse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3912b026-7cd2-9981-27eb-e8e37be9bbad@gmail.com \
    --to=like.xu.linux@gmail.com \
    --cc=dwmw2@infradead.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oliver.upton@linux.dev \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox