public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/3] KVM: x86: Include host suspended time in steal time.
@ 2024-08-20  4:35 Suleiman Souhlal
  2024-08-20  4:35 ` [PATCH v2 1/3] KVM: Introduce kvm_total_suspend_ns() Suleiman Souhlal
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Suleiman Souhlal @ 2024-08-20  4:35 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chao Gao, kvm, linux-kernel, ssouhlal,
	Suleiman Souhlal

This series makes it so that the time that the host is suspended is
included in guests' steal time.

When the host resumes from a suspend, the guest thinks any task
that was running during the suspend ran for a long time, even though
the effective run time was much shorter, which can end up having
negative effects with scheduling. This can be particularly noticeable
if the guest task was RT, as it can end up getting throttled for a
long time.

To mitigate this issue, we include the time that the host was
suspended in steal time, which lets the guest can subtract the
duration from the tasks' runtime.

(v1 was at https://lore.kernel.org/kvm/20240710074410.770409-1-suleiman@google.com/)

v1 -> v2:
- Accumulate suspend time at machine-independent kvm layer and track per-VCPU
  instead of per-VM.
- Document changes.

Suleiman Souhlal (3):
  KVM: Introduce kvm_total_suspend_ns().
  KVM: x86: Include host suspended time in steal time.
  KVM: x86: Document host suspend being included in steal time.

 Documentation/virt/kvm/x86/msr.rst |  6 ++++--
 arch/x86/include/asm/kvm_host.h    |  1 +
 arch/x86/kvm/x86.c                 | 11 ++++++++++-
 include/linux/kvm_host.h           |  2 ++
 virt/kvm/kvm_main.c                | 13 +++++++++++++
 5 files changed, 30 insertions(+), 3 deletions(-)

-- 
2.46.0.184.g6999bdac58-goog


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 1/3] KVM: Introduce kvm_total_suspend_ns().
  2024-08-20  4:35 [PATCH v2 0/3] KVM: x86: Include host suspended time in steal time Suleiman Souhlal
@ 2024-08-20  4:35 ` Suleiman Souhlal
  2024-08-21  5:40   ` Chao Gao
  2024-08-20  4:35 ` [PATCH v2 2/3] KVM: x86: Include host suspended time in steal time Suleiman Souhlal
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 12+ messages in thread
From: Suleiman Souhlal @ 2024-08-20  4:35 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chao Gao, kvm, linux-kernel, ssouhlal,
	Suleiman Souhlal

It returns the cumulative nanoseconds that the host has been suspended.
It is intended to be used for reporting host suspend time to the guest.

Signed-off-by: Suleiman Souhlal <suleiman@google.com>
---
 include/linux/kvm_host.h |  2 ++
 virt/kvm/kvm_main.c      | 13 +++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index b23c6d48392f7c..8fec37b372d8c0 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2494,4 +2494,6 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
 				    struct kvm_pre_fault_memory *range);
 #endif
 
+u64 kvm_total_suspend_ns(void);
+
 #endif
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index cb2b78e92910fb..2235933d9247bc 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5720,6 +5720,15 @@ static void kvm_shutdown(void)
 	on_each_cpu(hardware_disable_nolock, NULL, 1);
 }
 
+static u64 last_suspend;
+static u64 total_suspend_ns;
+
+u64
+kvm_total_suspend_ns(void)
+{
+	return total_suspend_ns;
+}
+
 static int kvm_suspend(void)
 {
 	/*
@@ -5735,6 +5744,8 @@ static int kvm_suspend(void)
 
 	if (kvm_usage_count)
 		hardware_disable_nolock(NULL);
+
+	last_suspend = ktime_get_boottime_ns();
 	return 0;
 }
 
@@ -5745,6 +5756,8 @@ static void kvm_resume(void)
 
 	if (kvm_usage_count)
 		WARN_ON_ONCE(__hardware_enable_nolock());
+
+	total_suspend_ns += ktime_get_boottime_ns() - last_suspend;
 }
 
 static struct syscore_ops kvm_syscore_ops = {
-- 
2.46.0.184.g6999bdac58-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 2/3] KVM: x86: Include host suspended time in steal time.
  2024-08-20  4:35 [PATCH v2 0/3] KVM: x86: Include host suspended time in steal time Suleiman Souhlal
  2024-08-20  4:35 ` [PATCH v2 1/3] KVM: Introduce kvm_total_suspend_ns() Suleiman Souhlal
@ 2024-08-20  4:35 ` Suleiman Souhlal
  2024-08-21  6:31   ` Chao Gao
  2024-08-28  9:56   ` Suleiman Souhlal
  2024-08-20  4:35 ` [PATCH v2 3/3] KVM: x86: Document host suspend being included " Suleiman Souhlal
  2024-09-25 13:54 ` [PATCH v2 0/3] KVM: x86: Include host suspended time " Sean Christopherson
  3 siblings, 2 replies; 12+ messages in thread
From: Suleiman Souhlal @ 2024-08-20  4:35 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chao Gao, kvm, linux-kernel, ssouhlal,
	Suleiman Souhlal

When the host resumes from a suspend, the guest thinks any task
that was running during the suspend ran for a long time, even though
the effective run time was much shorter, which can end up having
negative effects with scheduling. This can be particularly noticeable
if the guest task was RT, as it can end up getting throttled for a
long time.

To mitigate this issue, we include the time that the host was
suspended in steal time, which lets the guest subtract the duration from
the tasks' runtime.

Note that the case of a suspend happening during a VM migration
might not be accounted.

Signed-off-by: Suleiman Souhlal <suleiman@google.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/x86.c              | 11 ++++++++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4a68cb3eba78f8..728798decb6d12 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -898,6 +898,7 @@ struct kvm_vcpu_arch {
 		u8 preempted;
 		u64 msr_val;
 		u64 last_steal;
+		u64 last_suspend_ns;
 		struct gfn_to_hva_cache cache;
 	} st;
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 70219e4069874a..104f3d318026fa 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3654,7 +3654,7 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
 	struct kvm_steal_time __user *st;
 	struct kvm_memslots *slots;
 	gpa_t gpa = vcpu->arch.st.msr_val & KVM_STEAL_VALID_BITS;
-	u64 steal;
+	u64 steal, suspend_ns;
 	u32 version;
 
 	if (kvm_xen_msr_enabled(vcpu->kvm)) {
@@ -3735,6 +3735,14 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
 	steal += current->sched_info.run_delay -
 		vcpu->arch.st.last_steal;
 	vcpu->arch.st.last_steal = current->sched_info.run_delay;
+	/*
+	 * Include the time that the host was suspended in steal time.
+	 * Note that the case of a suspend happening during a VM migration
+	 * might not be accounted.
+	 */
+	suspend_ns = kvm_total_suspend_ns();
+	steal += suspend_ns - vcpu->arch.st.last_suspend_ns;
+	vcpu->arch.st.last_suspend_ns = suspend_ns;
 	unsafe_put_user(steal, &st->steal, out);
 
 	version += 1;
@@ -12280,6 +12288,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 
 	vcpu->arch.arch_capabilities = kvm_get_arch_capabilities();
 	vcpu->arch.msr_platform_info = MSR_PLATFORM_INFO_CPUID_FAULT;
+	vcpu->arch.st.last_suspend_ns = kvm_total_suspend_ns();
 	kvm_xen_init_vcpu(vcpu);
 	vcpu_load(vcpu);
 	kvm_set_tsc_khz(vcpu, vcpu->kvm->arch.default_tsc_khz);
-- 
2.46.0.184.g6999bdac58-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 3/3] KVM: x86: Document host suspend being included in steal time.
  2024-08-20  4:35 [PATCH v2 0/3] KVM: x86: Include host suspended time in steal time Suleiman Souhlal
  2024-08-20  4:35 ` [PATCH v2 1/3] KVM: Introduce kvm_total_suspend_ns() Suleiman Souhlal
  2024-08-20  4:35 ` [PATCH v2 2/3] KVM: x86: Include host suspended time in steal time Suleiman Souhlal
@ 2024-08-20  4:35 ` Suleiman Souhlal
  2024-09-25 13:54 ` [PATCH v2 0/3] KVM: x86: Include host suspended time " Sean Christopherson
  3 siblings, 0 replies; 12+ messages in thread
From: Suleiman Souhlal @ 2024-08-20  4:35 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chao Gao, kvm, linux-kernel, ssouhlal,
	Suleiman Souhlal

Steal time now includes the time that the host was suspended.

Signed-off-by: Suleiman Souhlal <suleiman@google.com>
---
 Documentation/virt/kvm/x86/msr.rst | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/Documentation/virt/kvm/x86/msr.rst b/Documentation/virt/kvm/x86/msr.rst
index 3aecf2a70e7b43..81c17c2200ca2f 100644
--- a/Documentation/virt/kvm/x86/msr.rst
+++ b/Documentation/virt/kvm/x86/msr.rst
@@ -294,8 +294,10 @@ data:
 
 	steal:
 		the amount of time in which this vCPU did not run, in
-		nanoseconds. Time during which the vcpu is idle, will not be
-		reported as steal time.
+		nanoseconds. This includes the time during which the host is
+		suspended. However, the case where the host suspends during a
+		VM migration might not be correctly accounted. Time during
+		which the vcpu is idle, will not be reported as steal time.
 
 	preempted:
 		indicate the vCPU who owns this struct is running or
-- 
2.46.0.184.g6999bdac58-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/3] KVM: Introduce kvm_total_suspend_ns().
  2024-08-20  4:35 ` [PATCH v2 1/3] KVM: Introduce kvm_total_suspend_ns() Suleiman Souhlal
@ 2024-08-21  5:40   ` Chao Gao
  2024-08-21  6:01     ` Suleiman Souhlal
  0 siblings, 1 reply; 12+ messages in thread
From: Chao Gao @ 2024-08-21  5:40 UTC (permalink / raw)
  To: Suleiman Souhlal
  Cc: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, kvm,
	linux-kernel, ssouhlal

On Tue, Aug 20, 2024 at 01:35:41PM +0900, Suleiman Souhlal wrote:
>It returns the cumulative nanoseconds that the host has been suspended.
>It is intended to be used for reporting host suspend time to the guest.
>
>Signed-off-by: Suleiman Souhlal <suleiman@google.com>

Reviewed-by: Chao Gao <chao.gao@intel.com>

one nit below

>---
> include/linux/kvm_host.h |  2 ++
> virt/kvm/kvm_main.c      | 13 +++++++++++++
> 2 files changed, 15 insertions(+)
>
>diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>index b23c6d48392f7c..8fec37b372d8c0 100644
>--- a/include/linux/kvm_host.h
>+++ b/include/linux/kvm_host.h
>@@ -2494,4 +2494,6 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
> 				    struct kvm_pre_fault_memory *range);
> #endif
> 
>+u64 kvm_total_suspend_ns(void);
>+
> #endif
>diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>index cb2b78e92910fb..2235933d9247bc 100644
>--- a/virt/kvm/kvm_main.c
>+++ b/virt/kvm/kvm_main.c
>@@ -5720,6 +5720,15 @@ static void kvm_shutdown(void)
> 	on_each_cpu(hardware_disable_nolock, NULL, 1);
> }
> 
>+static u64 last_suspend;
>+static u64 total_suspend_ns;
>+
>+u64
>+kvm_total_suspend_ns(void)

nit: don't wrap before the function name.

>+{
>+	return total_suspend_ns;
>+}
>+
> static int kvm_suspend(void)
> {
> 	/*
>@@ -5735,6 +5744,8 @@ static int kvm_suspend(void)
> 
> 	if (kvm_usage_count)
> 		hardware_disable_nolock(NULL);
>+
>+	last_suspend = ktime_get_boottime_ns();
> 	return 0;
> }
> 
>@@ -5745,6 +5756,8 @@ static void kvm_resume(void)
> 
> 	if (kvm_usage_count)
> 		WARN_ON_ONCE(__hardware_enable_nolock());
>+
>+	total_suspend_ns += ktime_get_boottime_ns() - last_suspend;
> }
> 
> static struct syscore_ops kvm_syscore_ops = {
>-- 
>2.46.0.184.g6999bdac58-goog
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/3] KVM: Introduce kvm_total_suspend_ns().
  2024-08-21  5:40   ` Chao Gao
@ 2024-08-21  6:01     ` Suleiman Souhlal
  0 siblings, 0 replies; 12+ messages in thread
From: Suleiman Souhlal @ 2024-08-21  6:01 UTC (permalink / raw)
  To: Chao Gao
  Cc: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, kvm,
	linux-kernel, ssouhlal

On Wed, Aug 21, 2024 at 2:40 PM Chao Gao <chao.gao@intel.com> wrote:
>
> On Tue, Aug 20, 2024 at 01:35:41PM +0900, Suleiman Souhlal wrote:
> >It returns the cumulative nanoseconds that the host has been suspended.
> >It is intended to be used for reporting host suspend time to the guest.
> >
> >Signed-off-by: Suleiman Souhlal <suleiman@google.com>
>
> Reviewed-by: Chao Gao <chao.gao@intel.com>
>
> one nit below
>
> >---
> > include/linux/kvm_host.h |  2 ++
> > virt/kvm/kvm_main.c      | 13 +++++++++++++
> > 2 files changed, 15 insertions(+)
> >
> >diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> >index b23c6d48392f7c..8fec37b372d8c0 100644
> >--- a/include/linux/kvm_host.h
> >+++ b/include/linux/kvm_host.h
> >@@ -2494,4 +2494,6 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
> >                                   struct kvm_pre_fault_memory *range);
> > #endif
> >
> >+u64 kvm_total_suspend_ns(void);
> >+
> > #endif
> >diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> >index cb2b78e92910fb..2235933d9247bc 100644
> >--- a/virt/kvm/kvm_main.c
> >+++ b/virt/kvm/kvm_main.c
> >@@ -5720,6 +5720,15 @@ static void kvm_shutdown(void)
> >       on_each_cpu(hardware_disable_nolock, NULL, 1);
> > }
> >
> >+static u64 last_suspend;
> >+static u64 total_suspend_ns;
> >+
> >+u64
> >+kvm_total_suspend_ns(void)
>
> nit: don't wrap before the function name.

Sorry, I completely missed that, even after Sean told me.
Force of habit (FreeBSD style(9) says you have to do it).

If I send another version I will fix that.

-- Suleiman

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/3] KVM: x86: Include host suspended time in steal time.
  2024-08-20  4:35 ` [PATCH v2 2/3] KVM: x86: Include host suspended time in steal time Suleiman Souhlal
@ 2024-08-21  6:31   ` Chao Gao
  2024-08-23  4:17     ` Suleiman Souhlal
  2024-08-28  9:56   ` Suleiman Souhlal
  1 sibling, 1 reply; 12+ messages in thread
From: Chao Gao @ 2024-08-21  6:31 UTC (permalink / raw)
  To: Suleiman Souhlal
  Cc: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, kvm,
	linux-kernel, ssouhlal

On Tue, Aug 20, 2024 at 01:35:42PM +0900, Suleiman Souhlal wrote:
>When the host resumes from a suspend, the guest thinks any task
>that was running during the suspend ran for a long time, even though
>the effective run time was much shorter, which can end up having
>negative effects with scheduling. This can be particularly noticeable
>if the guest task was RT, as it can end up getting throttled for a
>long time.
>
>To mitigate this issue, we include the time that the host was
>suspended in steal time, which lets the guest subtract the duration from
>the tasks' runtime.
>
>Note that the case of a suspend happening during a VM migration
>might not be accounted.
>
>Signed-off-by: Suleiman Souhlal <suleiman@google.com>
>---
> arch/x86/include/asm/kvm_host.h |  1 +
> arch/x86/kvm/x86.c              | 11 ++++++++++-
> 2 files changed, 11 insertions(+), 1 deletion(-)
>
>diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>index 4a68cb3eba78f8..728798decb6d12 100644
>--- a/arch/x86/include/asm/kvm_host.h
>+++ b/arch/x86/include/asm/kvm_host.h
>@@ -898,6 +898,7 @@ struct kvm_vcpu_arch {
> 		u8 preempted;
> 		u64 msr_val;
> 		u64 last_steal;
>+		u64 last_suspend_ns;
> 		struct gfn_to_hva_cache cache;
> 	} st;
> 
>diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>index 70219e4069874a..104f3d318026fa 100644
>--- a/arch/x86/kvm/x86.c
>+++ b/arch/x86/kvm/x86.c
>@@ -3654,7 +3654,7 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
> 	struct kvm_steal_time __user *st;
> 	struct kvm_memslots *slots;
> 	gpa_t gpa = vcpu->arch.st.msr_val & KVM_STEAL_VALID_BITS;
>-	u64 steal;
>+	u64 steal, suspend_ns;
> 	u32 version;
> 
> 	if (kvm_xen_msr_enabled(vcpu->kvm)) {
>@@ -3735,6 +3735,14 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
> 	steal += current->sched_info.run_delay -
> 		vcpu->arch.st.last_steal;
> 	vcpu->arch.st.last_steal = current->sched_info.run_delay;
>+	/*
>+	 * Include the time that the host was suspended in steal time.
>+	 * Note that the case of a suspend happening during a VM migration
>+	 * might not be accounted.
>+	 */
>+	suspend_ns = kvm_total_suspend_ns();
>+	steal += suspend_ns - vcpu->arch.st.last_suspend_ns;
>+	vcpu->arch.st.last_suspend_ns = suspend_ns;

The document in patch 3 states:

  Time during which the vcpu is idle, will not be reported as steal time

I'm wondering if all host suspend time should be reported as steal time,
or if the suspend time during a vCPU halt should be excluded.

> 	unsafe_put_user(steal, &st->steal, out);
> 
> 	version += 1;
>@@ -12280,6 +12288,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> 
> 	vcpu->arch.arch_capabilities = kvm_get_arch_capabilities();
> 	vcpu->arch.msr_platform_info = MSR_PLATFORM_INFO_CPUID_FAULT;
>+	vcpu->arch.st.last_suspend_ns = kvm_total_suspend_ns();

is this necessary? I doubt this because KVM doesn't capture
current->sched_info.run_delay here.

> 	kvm_xen_init_vcpu(vcpu);
> 	vcpu_load(vcpu);
> 	kvm_set_tsc_khz(vcpu, vcpu->kvm->arch.default_tsc_khz);
>-- 
>2.46.0.184.g6999bdac58-goog
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/3] KVM: x86: Include host suspended time in steal time.
  2024-08-21  6:31   ` Chao Gao
@ 2024-08-23  4:17     ` Suleiman Souhlal
  2024-08-23  5:25       ` Chao Gao
  0 siblings, 1 reply; 12+ messages in thread
From: Suleiman Souhlal @ 2024-08-23  4:17 UTC (permalink / raw)
  To: Chao Gao
  Cc: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, kvm,
	linux-kernel, ssouhlal

On Wed, Aug 21, 2024 at 3:31 PM Chao Gao <chao.gao@intel.com> wrote:
>
> On Tue, Aug 20, 2024 at 01:35:42PM +0900, Suleiman Souhlal wrote:
> >When the host resumes from a suspend, the guest thinks any task
> >that was running during the suspend ran for a long time, even though
> >the effective run time was much shorter, which can end up having
> >negative effects with scheduling. This can be particularly noticeable
> >if the guest task was RT, as it can end up getting throttled for a
> >long time.
> >
> >To mitigate this issue, we include the time that the host was
> >suspended in steal time, which lets the guest subtract the duration from
> >the tasks' runtime.
> >
> >Note that the case of a suspend happening during a VM migration
> >might not be accounted.
> >
> >Signed-off-by: Suleiman Souhlal <suleiman@google.com>
> >---
> > arch/x86/include/asm/kvm_host.h |  1 +
> > arch/x86/kvm/x86.c              | 11 ++++++++++-
> > 2 files changed, 11 insertions(+), 1 deletion(-)
> >
> >diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> >index 4a68cb3eba78f8..728798decb6d12 100644
> >--- a/arch/x86/include/asm/kvm_host.h
> >+++ b/arch/x86/include/asm/kvm_host.h
> >@@ -898,6 +898,7 @@ struct kvm_vcpu_arch {
> >               u8 preempted;
> >               u64 msr_val;
> >               u64 last_steal;
> >+              u64 last_suspend_ns;
> >               struct gfn_to_hva_cache cache;
> >       } st;
> >
> >diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >index 70219e4069874a..104f3d318026fa 100644
> >--- a/arch/x86/kvm/x86.c
> >+++ b/arch/x86/kvm/x86.c
> >@@ -3654,7 +3654,7 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
> >       struct kvm_steal_time __user *st;
> >       struct kvm_memslots *slots;
> >       gpa_t gpa = vcpu->arch.st.msr_val & KVM_STEAL_VALID_BITS;
> >-      u64 steal;
> >+      u64 steal, suspend_ns;
> >       u32 version;
> >
> >       if (kvm_xen_msr_enabled(vcpu->kvm)) {
> >@@ -3735,6 +3735,14 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
> >       steal += current->sched_info.run_delay -
> >               vcpu->arch.st.last_steal;
> >       vcpu->arch.st.last_steal = current->sched_info.run_delay;
> >+      /*
> >+       * Include the time that the host was suspended in steal time.
> >+       * Note that the case of a suspend happening during a VM migration
> >+       * might not be accounted.
> >+       */
> >+      suspend_ns = kvm_total_suspend_ns();
> >+      steal += suspend_ns - vcpu->arch.st.last_suspend_ns;
> >+      vcpu->arch.st.last_suspend_ns = suspend_ns;
>
> The document in patch 3 states:
>
>   Time during which the vcpu is idle, will not be reported as steal time
>
> I'm wondering if all host suspend time should be reported as steal time,
> or if the suspend time during a vCPU halt should be excluded.

I think the statement about idle time not being reported as steal isn't
completely accurate, so I'm not sure if it's worth the extra complexity.

>
> >       unsafe_put_user(steal, &st->steal, out);
> >
> >       version += 1;
> >@@ -12280,6 +12288,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> >
> >       vcpu->arch.arch_capabilities = kvm_get_arch_capabilities();
> >       vcpu->arch.msr_platform_info = MSR_PLATFORM_INFO_CPUID_FAULT;
> >+      vcpu->arch.st.last_suspend_ns = kvm_total_suspend_ns();
>
> is this necessary? I doubt this because KVM doesn't capture
> current->sched_info.run_delay here.

Isn't run_delay being captured by the scheduler at all time?

We need to initialize last_suspend_ns otherwise the first call to
record_steal_time() for a VCPU would report a wrong value if
the VCPU is started after the host has already had a suspend.

Thanks,
-- Suleiman

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/3] KVM: x86: Include host suspended time in steal time.
  2024-08-23  4:17     ` Suleiman Souhlal
@ 2024-08-23  5:25       ` Chao Gao
  2024-08-23  5:43         ` Suleiman Souhlal
  0 siblings, 1 reply; 12+ messages in thread
From: Chao Gao @ 2024-08-23  5:25 UTC (permalink / raw)
  To: Suleiman Souhlal
  Cc: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, kvm,
	linux-kernel, ssouhlal

On Fri, Aug 23, 2024 at 01:17:31PM +0900, Suleiman Souhlal wrote:
>On Wed, Aug 21, 2024 at 3:31 PM Chao Gao <chao.gao@intel.com> wrote:
>>
>> On Tue, Aug 20, 2024 at 01:35:42PM +0900, Suleiman Souhlal wrote:
>> >When the host resumes from a suspend, the guest thinks any task
>> >that was running during the suspend ran for a long time, even though
>> >the effective run time was much shorter, which can end up having
>> >negative effects with scheduling. This can be particularly noticeable
>> >if the guest task was RT, as it can end up getting throttled for a
>> >long time.
>> >
>> >To mitigate this issue, we include the time that the host was
>> >suspended in steal time, which lets the guest subtract the duration from
>> >the tasks' runtime.
>> >
>> >Note that the case of a suspend happening during a VM migration
>> >might not be accounted.
>> >
>> >Signed-off-by: Suleiman Souhlal <suleiman@google.com>
>> >---
>> > arch/x86/include/asm/kvm_host.h |  1 +
>> > arch/x86/kvm/x86.c              | 11 ++++++++++-
>> > 2 files changed, 11 insertions(+), 1 deletion(-)
>> >
>> >diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> >index 4a68cb3eba78f8..728798decb6d12 100644
>> >--- a/arch/x86/include/asm/kvm_host.h
>> >+++ b/arch/x86/include/asm/kvm_host.h
>> >@@ -898,6 +898,7 @@ struct kvm_vcpu_arch {
>> >               u8 preempted;
>> >               u64 msr_val;
>> >               u64 last_steal;
>> >+              u64 last_suspend_ns;
>> >               struct gfn_to_hva_cache cache;
>> >       } st;
>> >
>> >diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> >index 70219e4069874a..104f3d318026fa 100644
>> >--- a/arch/x86/kvm/x86.c
>> >+++ b/arch/x86/kvm/x86.c
>> >@@ -3654,7 +3654,7 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
>> >       struct kvm_steal_time __user *st;
>> >       struct kvm_memslots *slots;
>> >       gpa_t gpa = vcpu->arch.st.msr_val & KVM_STEAL_VALID_BITS;
>> >-      u64 steal;
>> >+      u64 steal, suspend_ns;
>> >       u32 version;
>> >
>> >       if (kvm_xen_msr_enabled(vcpu->kvm)) {
>> >@@ -3735,6 +3735,14 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
>> >       steal += current->sched_info.run_delay -
>> >               vcpu->arch.st.last_steal;
>> >       vcpu->arch.st.last_steal = current->sched_info.run_delay;
>> >+      /*
>> >+       * Include the time that the host was suspended in steal time.
>> >+       * Note that the case of a suspend happening during a VM migration
>> >+       * might not be accounted.
>> >+       */
>> >+      suspend_ns = kvm_total_suspend_ns();
>> >+      steal += suspend_ns - vcpu->arch.st.last_suspend_ns;
>> >+      vcpu->arch.st.last_suspend_ns = suspend_ns;
>>
>> The document in patch 3 states:
>>
>>   Time during which the vcpu is idle, will not be reported as steal time
>>
>> I'm wondering if all host suspend time should be reported as steal time,
>> or if the suspend time during a vCPU halt should be excluded.
>
>I think the statement about idle time not being reported as steal isn't
>completely accurate, so I'm not sure if it's worth the extra complexity.
>
>>
>> >       unsafe_put_user(steal, &st->steal, out);
>> >
>> >       version += 1;
>> >@@ -12280,6 +12288,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>> >
>> >       vcpu->arch.arch_capabilities = kvm_get_arch_capabilities();
>> >       vcpu->arch.msr_platform_info = MSR_PLATFORM_INFO_CPUID_FAULT;
>> >+      vcpu->arch.st.last_suspend_ns = kvm_total_suspend_ns();
>>
>> is this necessary? I doubt this because KVM doesn't capture
>> current->sched_info.run_delay here.
>
>Isn't run_delay being captured by the scheduler at all time?

I meant KVM doesn't do:

	vcpu->arch.st.last_steal = current->sched_info.run_delay;

at vCPU creation time.

>
>We need to initialize last_suspend_ns otherwise the first call to
>record_steal_time() for a VCPU would report a wrong value if
>the VCPU is started after the host has already had a suspend.

But initializing last_suspend_ns here doesn't guarantee KVM won't report a
"wrong" value because a suspend can happen after vCPU creation and before
its first VM-enter.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/3] KVM: x86: Include host suspended time in steal time.
  2024-08-23  5:25       ` Chao Gao
@ 2024-08-23  5:43         ` Suleiman Souhlal
  0 siblings, 0 replies; 12+ messages in thread
From: Suleiman Souhlal @ 2024-08-23  5:43 UTC (permalink / raw)
  To: Chao Gao
  Cc: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, kvm,
	linux-kernel, ssouhlal

On Fri, Aug 23, 2024 at 2:25 PM Chao Gao <chao.gao@intel.com> wrote:
>
> On Fri, Aug 23, 2024 at 01:17:31PM +0900, Suleiman Souhlal wrote:
> >On Wed, Aug 21, 2024 at 3:31 PM Chao Gao <chao.gao@intel.com> wrote:
> >>
> >> On Tue, Aug 20, 2024 at 01:35:42PM +0900, Suleiman Souhlal wrote:
> >> >When the host resumes from a suspend, the guest thinks any task
> >> >that was running during the suspend ran for a long time, even though
> >> >the effective run time was much shorter, which can end up having
> >> >negative effects with scheduling. This can be particularly noticeable
> >> >if the guest task was RT, as it can end up getting throttled for a
> >> >long time.
> >> >
> >> >To mitigate this issue, we include the time that the host was
> >> >suspended in steal time, which lets the guest subtract the duration from
> >> >the tasks' runtime.
> >> >
> >> >Note that the case of a suspend happening during a VM migration
> >> >might not be accounted.
> >> >
> >> >Signed-off-by: Suleiman Souhlal <suleiman@google.com>
> >> >---
> >> > arch/x86/include/asm/kvm_host.h |  1 +
> >> > arch/x86/kvm/x86.c              | 11 ++++++++++-
> >> > 2 files changed, 11 insertions(+), 1 deletion(-)
> >> >
> >> >diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> >> >index 4a68cb3eba78f8..728798decb6d12 100644
> >> >--- a/arch/x86/include/asm/kvm_host.h
> >> >+++ b/arch/x86/include/asm/kvm_host.h
> >> >@@ -898,6 +898,7 @@ struct kvm_vcpu_arch {
> >> >               u8 preempted;
> >> >               u64 msr_val;
> >> >               u64 last_steal;
> >> >+              u64 last_suspend_ns;
> >> >               struct gfn_to_hva_cache cache;
> >> >       } st;
> >> >
> >> >diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >> >index 70219e4069874a..104f3d318026fa 100644
> >> >--- a/arch/x86/kvm/x86.c
> >> >+++ b/arch/x86/kvm/x86.c
> >> >@@ -3654,7 +3654,7 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
> >> >       struct kvm_steal_time __user *st;
> >> >       struct kvm_memslots *slots;
> >> >       gpa_t gpa = vcpu->arch.st.msr_val & KVM_STEAL_VALID_BITS;
> >> >-      u64 steal;
> >> >+      u64 steal, suspend_ns;
> >> >       u32 version;
> >> >
> >> >       if (kvm_xen_msr_enabled(vcpu->kvm)) {
> >> >@@ -3735,6 +3735,14 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
> >> >       steal += current->sched_info.run_delay -
> >> >               vcpu->arch.st.last_steal;
> >> >       vcpu->arch.st.last_steal = current->sched_info.run_delay;
> >> >+      /*
> >> >+       * Include the time that the host was suspended in steal time.
> >> >+       * Note that the case of a suspend happening during a VM migration
> >> >+       * might not be accounted.
> >> >+       */
> >> >+      suspend_ns = kvm_total_suspend_ns();
> >> >+      steal += suspend_ns - vcpu->arch.st.last_suspend_ns;
> >> >+      vcpu->arch.st.last_suspend_ns = suspend_ns;
> >>
> >> The document in patch 3 states:
> >>
> >>   Time during which the vcpu is idle, will not be reported as steal time
> >>
> >> I'm wondering if all host suspend time should be reported as steal time,
> >> or if the suspend time during a vCPU halt should be excluded.
> >
> >I think the statement about idle time not being reported as steal isn't
> >completely accurate, so I'm not sure if it's worth the extra complexity.
> >
> >>
> >> >       unsafe_put_user(steal, &st->steal, out);
> >> >
> >> >       version += 1;
> >> >@@ -12280,6 +12288,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> >> >
> >> >       vcpu->arch.arch_capabilities = kvm_get_arch_capabilities();
> >> >       vcpu->arch.msr_platform_info = MSR_PLATFORM_INFO_CPUID_FAULT;
> >> >+      vcpu->arch.st.last_suspend_ns = kvm_total_suspend_ns();
> >>
> >> is this necessary? I doubt this because KVM doesn't capture
> >> current->sched_info.run_delay here.
> >
> >Isn't run_delay being captured by the scheduler at all time?
>
> I meant KVM doesn't do:
>
>         vcpu->arch.st.last_steal = current->sched_info.run_delay;
>
> at vCPU creation time.

I think for run_delay it's different because run_delay is a time
difference. It's something that gets added to steal, not relative
to the previous steal value.
From what I can tell, it's correct for last_steal to be initialized to 0.

>
> >
> >We need to initialize last_suspend_ns otherwise the first call to
> >record_steal_time() for a VCPU would report a wrong value if
> >the VCPU is started after the host has already had a suspend.
>
> But initializing last_suspend_ns here doesn't guarantee KVM won't report a
> "wrong" value because a suspend can happen after vCPU creation and before
> its first VM-enter.

I see what you're saying.
I'm not sure how much this matters in practice.

-- Suleiman

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/3] KVM: x86: Include host suspended time in steal time.
  2024-08-20  4:35 ` [PATCH v2 2/3] KVM: x86: Include host suspended time in steal time Suleiman Souhlal
  2024-08-21  6:31   ` Chao Gao
@ 2024-08-28  9:56   ` Suleiman Souhlal
  1 sibling, 0 replies; 12+ messages in thread
From: Suleiman Souhlal @ 2024-08-28  9:56 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chao Gao, kvm, linux-kernel, ssouhlal

On Tue, Aug 20, 2024 at 1:38 PM Suleiman Souhlal <suleiman@google.com> wrote:
> @@ -3735,6 +3735,14 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
>         steal += current->sched_info.run_delay -
>                 vcpu->arch.st.last_steal;
>         vcpu->arch.st.last_steal = current->sched_info.run_delay;
> +       /*
> +        * Include the time that the host was suspended in steal time.
> +        * Note that the case of a suspend happening during a VM migration
> +        * might not be accounted.
> +        */
> +       suspend_ns = kvm_total_suspend_ns();
> +       steal += suspend_ns - vcpu->arch.st.last_suspend_ns;
> +       vcpu->arch.st.last_suspend_ns = suspend_ns;
>         unsafe_put_user(steal, &st->steal, out);
>
>         version += 1;

There is an issue here: We are calling a function under UACCESS, which
raises an objtool warning.
I'll be sending a v3 with that addressed (and the function return
value wrapping in patch 1/3).

-- Suleiman

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 0/3] KVM: x86: Include host suspended time in steal time.
  2024-08-20  4:35 [PATCH v2 0/3] KVM: x86: Include host suspended time in steal time Suleiman Souhlal
                   ` (2 preceding siblings ...)
  2024-08-20  4:35 ` [PATCH v2 3/3] KVM: x86: Document host suspend being included " Suleiman Souhlal
@ 2024-09-25 13:54 ` Sean Christopherson
  3 siblings, 0 replies; 12+ messages in thread
From: Sean Christopherson @ 2024-09-25 13:54 UTC (permalink / raw)
  To: Suleiman Souhlal
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin, Chao Gao, kvm, linux-kernel,
	ssouhlal, David Woodhouse

+David W for his input.

On Tue, Aug 20, 2024, Suleiman Souhlal wrote:
> This series makes it so that the time that the host is suspended is
> included in guests' steal time.
> 
> When the host resumes from a suspend, the guest thinks any task
> that was running during the suspend ran for a long time, even though
> the effective run time was much shorter, which can end up having
> negative effects with scheduling. This can be particularly noticeable
> if the guest task was RT, as it can end up getting throttled for a
> long time.
> 
> To mitigate this issue, we include the time that the host was
> suspended in steal time, which lets the guest can subtract the
> duration from the tasks' runtime.
> 
> (v1 was at https://lore.kernel.org/kvm/20240710074410.770409-1-suleiman@google.com/)
> 
> v1 -> v2:
> - Accumulate suspend time at machine-independent kvm layer and track per-VCPU
>   instead of per-VM.
> - Document changes.
> 
> Suleiman Souhlal (3):
>   KVM: Introduce kvm_total_suspend_ns().
>   KVM: x86: Include host suspended time in steal time.
>   KVM: x86: Document host suspend being included in steal time.
> 
>  Documentation/virt/kvm/x86/msr.rst |  6 ++++--
>  arch/x86/include/asm/kvm_host.h    |  1 +
>  arch/x86/kvm/x86.c                 | 11 ++++++++++-
>  include/linux/kvm_host.h           |  2 ++
>  virt/kvm/kvm_main.c                | 13 +++++++++++++
>  5 files changed, 30 insertions(+), 3 deletions(-)
> 
> -- 
> 2.46.0.184.g6999bdac58-goog
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2024-09-25 13:54 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-20  4:35 [PATCH v2 0/3] KVM: x86: Include host suspended time in steal time Suleiman Souhlal
2024-08-20  4:35 ` [PATCH v2 1/3] KVM: Introduce kvm_total_suspend_ns() Suleiman Souhlal
2024-08-21  5:40   ` Chao Gao
2024-08-21  6:01     ` Suleiman Souhlal
2024-08-20  4:35 ` [PATCH v2 2/3] KVM: x86: Include host suspended time in steal time Suleiman Souhlal
2024-08-21  6:31   ` Chao Gao
2024-08-23  4:17     ` Suleiman Souhlal
2024-08-23  5:25       ` Chao Gao
2024-08-23  5:43         ` Suleiman Souhlal
2024-08-28  9:56   ` Suleiman Souhlal
2024-08-20  4:35 ` [PATCH v2 3/3] KVM: x86: Document host suspend being included " Suleiman Souhlal
2024-09-25 13:54 ` [PATCH v2 0/3] KVM: x86: Include host suspended time " Sean Christopherson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox