All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: David Woodhouse <dwmw2@infradead.org>
Cc: kvm@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
	 Jonathan Corbet <corbet@lwn.net>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>,  Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org,  "H. Peter Anvin" <hpa@zytor.com>,
	Paul Durrant <paul@xen.org>,
	Peter Zijlstra <peterz@infradead.org>,
	 Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	 Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	 Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	 Daniel Bristot de Oliveira <bristot@redhat.com>,
	Valentin Schneider <vschneid@redhat.com>,
	Shuah Khan <shuah@kernel.org>,
	 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	 jalliste@amazon.co.uk, sveith@amazon.de, zide.chen@intel.com,
	 Dongli Zhang <dongli.zhang@oracle.com>,
	Chenyi Qiang <chenyi.qiang@intel.com>
Subject: Re: [RFC PATCH v3 20/21] KVM: x86/xen: Prevent runstate times from becoming negative
Date: Thu, 15 Aug 2024 21:39:23 -0700	[thread overview]
Message-ID: <Zr7X-5qK8sRXxyDP@google.com> (raw)
In-Reply-To: <20240522001817.619072-21-dwmw2@infradead.org>

On Wed, May 22, 2024, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> When kvm_xen_update_runstate() is invoked to set a vCPU's runstate, the
> time spent in the previous runstate is accounted. This is based on the
> delta between the current KVM clock time, and the previous value stored
> in vcpu->arch.xen.runstate_entry_time.
> 
> If the KVM clock goes backwards, that delta will be negative. Or, since
> it's an unsigned 64-bit integer, very *large*. Linux guests deal with
> that particularly badly, reporting 100% steal time for ever more (well,
> for *centuries* at least, until the delta has been consumed).
> 
> So when a negative delta is detected, just refrain from updating the
> runstates until the KVM clock catches up with runstate_entry_time again.
> 
> The userspace APIs for setting the runstate times do not allow them to
> be set past the current KVM clock, but userspace can still adjust the
> KVM clock *after* setting the runstate times, which would cause this
> situation to occur.
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>  arch/x86/kvm/xen.c | 18 ++++++++++++++----
>  1 file changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
> index 014048c22652..3d4111de4472 100644
> --- a/arch/x86/kvm/xen.c
> +++ b/arch/x86/kvm/xen.c
> @@ -538,24 +538,34 @@ void kvm_xen_update_runstate(struct kvm_vcpu *v, int state)
>  {
>  	struct kvm_vcpu_xen *vx = &v->arch.xen;
>  	u64 now = get_kvmclock_ns(v->kvm);
> -	u64 delta_ns = now - vx->runstate_entry_time;
>  	u64 run_delay = current->sched_info.run_delay;
> +	s64 delta_ns = now - vx->runstate_entry_time;
> +	s64 steal_ns = run_delay - vx->last_steal;
>  
>  	if (unlikely(!vx->runstate_entry_time))
>  		vx->current_runstate = RUNSTATE_offline;
>  
> +	vx->last_steal = run_delay;
> +
> +	/*
> +	 * If KVM clock time went backwards, stop updating until it
> +	 * catches up (or the runstates are reset by userspace).
> +	 */

I take it this is a legitimate scenario where userpace sets KVM clock and then
the runstates, and KVM needs to lend a hand because userspace can't do those two
things atomically?

> +	if (delta_ns < 0)
> +		return;
> +
>  	/*
>  	 * Time waiting for the scheduler isn't "stolen" if the
>  	 * vCPU wasn't running anyway.
>  	 */
> -	if (vx->current_runstate == RUNSTATE_running) {
> -		u64 steal_ns = run_delay - vx->last_steal;
> +	if (vx->current_runstate == RUNSTATE_running && steal_ns > 0) {
> +		if (steal_ns > delta_ns)
> +			steal_ns = delta_ns;
>  
>  		delta_ns -= steal_ns;
>  
>  		vx->runstate_times[RUNSTATE_runnable] += steal_ns;
>  	}
> -	vx->last_steal = run_delay;
>  
>  	vx->runstate_times[vx->current_runstate] += delta_ns;
>  	vx->current_runstate = state;
> -- 
> 2.44.0
> 

  parent reply	other threads:[~2024-08-16  4:39 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-22  0:16 [RFC PATCH v3 00/21] Cleaning up the KVM clock mess David Woodhouse
2024-05-22  0:16 ` [RFC PATCH v3 01/21] KVM: x86/xen: Do not corrupt KVM clock in kvm_xen_shared_info_init() David Woodhouse
2024-05-22  0:16 ` [RFC PATCH v3 02/21] KVM: x86: Improve accuracy of KVM clock when TSC scaling is in force David Woodhouse
2024-08-13 17:50   ` Sean Christopherson
2024-05-22  0:16 ` [RFC PATCH v3 03/21] KVM: x86: Add KVM_[GS]ET_CLOCK_GUEST for accurate KVM clock migration David Woodhouse
2024-05-22  0:16 ` [RFC PATCH v3 04/21] UAPI: x86: Move pvclock-abi to UAPI for x86 platforms David Woodhouse
2024-05-24 13:14   ` Paul Durrant
2024-08-13 18:07   ` Sean Christopherson
2024-05-22  0:17 ` [RFC PATCH v3 05/21] KVM: selftests: Add KVM/PV clock selftest to prove timer correction David Woodhouse
2024-08-13 18:55   ` Sean Christopherson
2024-05-22  0:17 ` [RFC PATCH v3 06/21] KVM: x86: Explicitly disable TSC scaling without CONSTANT_TSC David Woodhouse
2024-05-22  0:17 ` [RFC PATCH v3 07/21] KVM: x86: Add KVM_VCPU_TSC_SCALE and fix the documentation on TSC migration David Woodhouse
2024-08-14  1:52   ` Sean Christopherson
2024-05-22  0:17 ` [RFC PATCH v3 08/21] KVM: x86: Avoid NTP frequency skew for KVM clock on 32-bit host David Woodhouse
2024-08-14  1:57   ` Sean Christopherson
2024-05-22  0:17 ` [RFC PATCH v3 09/21] KVM: x86: Fix KVM clock precision in __get_kvmclock() David Woodhouse
2024-05-24 13:20   ` Paul Durrant
2024-08-14  2:58   ` Sean Christopherson
2024-05-22  0:17 ` [RFC PATCH v3 10/21] KVM: x86: Fix software TSC upscaling in kvm_update_guest_time() David Woodhouse
2024-05-24 13:26   ` Paul Durrant
2024-08-14  4:57   ` Sean Christopherson
2024-05-22  0:17 ` [RFC PATCH v3 11/21] KVM: x86: Simplify and comment kvm_get_time_scale() David Woodhouse
2024-05-24 13:53   ` Paul Durrant
2024-08-15 15:46   ` Sean Christopherson
2024-05-22  0:17 ` [RFC PATCH v3 12/21] KVM: x86: Remove implicit rdtsc() from kvm_compute_l1_tsc_offset() David Woodhouse
2024-05-24 13:56   ` Paul Durrant
2024-08-15 15:52   ` Sean Christopherson
2024-05-22  0:17 ` [RFC PATCH v3 13/21] KVM: x86: Improve synchronization in kvm_synchronize_tsc() David Woodhouse
2024-05-24 14:03   ` Paul Durrant
2024-08-15 16:00   ` Sean Christopherson
2024-05-22  0:17 ` [RFC PATCH v3 14/21] KVM: x86: Kill cur_tsc_{nsec,offset,write} fields David Woodhouse
2024-05-24 14:05   ` Paul Durrant
2024-08-15 16:30   ` Sean Christopherson
2024-05-22  0:17 ` [RFC PATCH v3 15/21] KVM: x86: Allow KVM master clock mode when TSCs are offset from each other David Woodhouse
2024-05-24 14:10   ` Paul Durrant
2024-08-16  2:38   ` Sean Christopherson
2024-05-22  0:17 ` [RFC PATCH v3 16/21] KVM: x86: Factor out kvm_use_master_clock() David Woodhouse
2024-05-24 14:13   ` Paul Durrant
2024-08-15 17:12   ` Sean Christopherson
2024-05-22  0:17 ` [RFC PATCH v3 17/21] KVM: x86: Avoid global clock update on setting KVM clock MSR David Woodhouse
2024-05-24 14:14   ` Paul Durrant
2024-08-16  4:28   ` Sean Christopherson
2024-05-22  0:17 ` [RFC PATCH v3 18/21] KVM: x86: Avoid gratuitous global clock reload in kvm_arch_vcpu_load() David Woodhouse
2024-05-24 14:16   ` Paul Durrant
2024-08-15 17:31   ` Sean Christopherson
2024-05-22  0:17 ` [RFC PATCH v3 19/21] KVM: x86: Avoid periodic KVM clock updates in master clock mode David Woodhouse
2024-05-24 14:18   ` Paul Durrant
2024-08-16  4:33   ` Sean Christopherson
2024-05-22  0:17 ` [RFC PATCH v3 20/21] KVM: x86/xen: Prevent runstate times from becoming negative David Woodhouse
2024-05-24 14:21   ` Paul Durrant
2024-08-16  4:39   ` Sean Christopherson [this message]
2024-08-20 10:22     ` David Woodhouse
2024-08-20 15:08       ` Steven Rostedt
2024-08-20 15:42         ` David Woodhouse
2024-05-22  0:17 ` [RFC PATCH v3 21/21] sched/cputime: Cope with steal time going backwards or negative David Woodhouse
2024-05-24 14:25   ` Paul Durrant
2024-07-02 14:09   ` Shrikanth Hegde
2024-08-16  4:35   ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zr7X-5qK8sRXxyDP@google.com \
    --to=seanjc@google.com \
    --cc=bp@alien8.de \
    --cc=bristot@redhat.com \
    --cc=bsegall@google.com \
    --cc=chenyi.qiang@intel.com \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=dongli.zhang@oracle.com \
    --cc=dwmw2@infradead.org \
    --cc=hpa@zytor.com \
    --cc=jalliste@amazon.co.uk \
    --cc=juri.lelli@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=paul@xen.org \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=shuah@kernel.org \
    --cc=sveith@amazon.de \
    --cc=tglx@linutronix.de \
    --cc=vincent.guittot@linaro.org \
    --cc=vschneid@redhat.com \
    --cc=x86@kernel.org \
    --cc=zide.chen@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.