* [PATCH v6 1/3] KVM: fix steal clock warp during guest cpu hotplug
2016-06-13 10:32 [PATCH v6 0/3] Sched, KVM: st: Add steal time support to full dynticks CPU time accounting Wanpeng Li
@ 2016-06-13 10:32 ` Wanpeng Li
2016-06-13 10:44 ` Paolo Bonzini
2016-06-13 10:32 ` [PATCH v6 2/3] sched/cputime: Fix prev steal time accouting during " Wanpeng Li
` (2 subsequent siblings)
3 siblings, 1 reply; 10+ messages in thread
From: Wanpeng Li @ 2016-06-13 10:32 UTC
To: linux-kernel, kvm
Cc: Wanpeng Li, Paolo Bonzini, Radim Krčmář,
Ingo Molnar, Peter Zijlstra (Intel), Rik van Riel,
Thomas Gleixner, Frederic Weisbecker, John Stultz
From: Wanpeng Li <wanpeng.li@hotmail.com>
Sometimes, after CPU hotplug you can observe a spike in stolen time
(100%) followed by the CPU being marked as 100% idle when it's actually
busy with a CPU hog task. The trace looks like the following:
cpuhp/1-12 [001] d.h1 167.461657: account_process_tick: steal = 1291385514, prev_steal_time = 0
cpuhp/1-12 [001] d.h1 167.461659: account_process_tick: steal_jiffies = 1291
<idle>-0 [001] d.h1 167.462663: account_process_tick: steal = 18732255, prev_steal_time = 1291000000
<idle>-0 [001] d.h1 167.462664: account_process_tick: steal_jiffies = 18446744072437
The sudden decrease of "steal" causes steal_jiffies to underflow.
The root cause is kvm_steal_time being reset to 0 after hot-plugging
back in a CPU. Instead, the preexisting value can be used, which is
what the core scheduler code expects.
John Stultz also reported a similar issue after guest S3.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
arch/x86/kernel/kvm.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index eea2a6f..1ef5e48 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -301,8 +301,6 @@ static void kvm_register_steal_time(void)
 	if (!has_steal_clock)
 		return;

-	memset(st, 0, sizeof(*st));
-
 	wrmsrl(MSR_KVM_STEAL_TIME, (slow_virt_to_phys(st) | KVM_MSR_ENABLED));
 	pr_info("kvm-stealtime: cpu %d, msr %llx\n",
 		cpu, (unsigned long long) slow_virt_to_phys(st));
--
1.9.1
* Re: [PATCH v6 1/3] KVM: fix steal clock warp during guest cpu hotplug
2016-06-13 10:32 ` [PATCH v6 1/3] KVM: fix steal clock warp during guest cpu hotplug Wanpeng Li
@ 2016-06-13 10:44 ` Paolo Bonzini
2016-06-13 11:28 ` Peter Zijlstra
2016-06-13 11:31 ` Wanpeng Li
0 siblings, 2 replies; 10+ messages in thread
From: Paolo Bonzini @ 2016-06-13 10:44 UTC
To: Wanpeng Li, linux-kernel, kvm
Cc: Wanpeng Li, Radim Krčmář, Ingo Molnar,
Peter Zijlstra (Intel), Rik van Riel, Thomas Gleixner,
Frederic Weisbecker, John Stultz
On 13/06/2016 12:32, Wanpeng Li wrote:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
>
> Sometimes, after CPU hotplug you can observe a spike in stolen time
> (100%) followed by the CPU being marked as 100% idle when it's actually
> busy with a CPU hog task. The trace looks like the following:
>
> cpuhp/1-12 [001] d.h1 167.461657: account_process_tick: steal = 1291385514, prev_steal_time = 0
> cpuhp/1-12 [001] d.h1 167.461659: account_process_tick: steal_jiffies = 1291
> <idle>-0 [001] d.h1 167.462663: account_process_tick: steal = 18732255, prev_steal_time = 1291000000
> <idle>-0 [001] d.h1 167.462664: account_process_tick: steal_jiffies = 18446744072437
>
> The sudden decrease of "steal" causes steal_jiffies to underflow.
> The root cause is kvm_steal_time being reset to 0 after hot-plugging
> back in a CPU. Instead, the preexisting value can be used, which is
> what the core scheduler code expects.
>
> John Stultz also reported a similar issue after guest S3.
>
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: John Stultz <john.stultz@linaro.org>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
> arch/x86/kernel/kvm.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index eea2a6f..1ef5e48 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -301,8 +301,6 @@ static void kvm_register_steal_time(void)
> if (!has_steal_clock)
> return;
>
> - memset(st, 0, sizeof(*st));
> -
> wrmsrl(MSR_KVM_STEAL_TIME, (slow_virt_to_phys(st) | KVM_MSR_ENABLED));
> pr_info("kvm-stealtime: cpu %d, msr %llx\n",
> cpu, (unsigned long long) slow_virt_to_phys(st));
>
Because there's no cover letter, I guess I have to ack each patch
independently.
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Also, there's really no relation between patches 1-2 and 3...
Paolo
* Re: [PATCH v6 1/3] KVM: fix steal clock warp during guest cpu hotplug
2016-06-13 10:44 ` Paolo Bonzini
@ 2016-06-13 11:28 ` Peter Zijlstra
2016-06-13 11:31 ` Wanpeng Li
1 sibling, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2016-06-13 11:28 UTC
To: Paolo Bonzini
Cc: Wanpeng Li, linux-kernel, kvm, Wanpeng Li,
Radim Krčmář, Ingo Molnar, Rik van Riel,
Thomas Gleixner, Frederic Weisbecker, John Stultz
On Mon, Jun 13, 2016 at 12:44:46PM +0200, Paolo Bonzini wrote:
> Because there's no cover letter, I guess I have to ack each patch
> independently.
>
> Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Thanks, I'll take the lot through the sched tree.
* Re: [PATCH v6 1/3] KVM: fix steal clock warp during guest cpu hotplug
2016-06-13 10:44 ` Paolo Bonzini
2016-06-13 11:28 ` Peter Zijlstra
@ 2016-06-13 11:31 ` Wanpeng Li
1 sibling, 0 replies; 10+ messages in thread
From: Wanpeng Li @ 2016-06-13 11:31 UTC
To: Paolo Bonzini
Cc: linux-kernel@vger.kernel.org, kvm, Wanpeng Li,
Radim Krčmář, Ingo Molnar, Peter Zijlstra (Intel),
Rik van Riel, Thomas Gleixner, Frederic Weisbecker, John Stultz
2016-06-13 18:44 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
>
>
> On 13/06/2016 12:32, Wanpeng Li wrote:
>> From: Wanpeng Li <wanpeng.li@hotmail.com>
>>
>> Sometimes, after CPU hotplug you can observe a spike in stolen time
>> (100%) followed by the CPU being marked as 100% idle when it's actually
>> busy with a CPU hog task. The trace looks like the following:
>>
>> cpuhp/1-12 [001] d.h1 167.461657: account_process_tick: steal = 1291385514, prev_steal_time = 0
>> cpuhp/1-12 [001] d.h1 167.461659: account_process_tick: steal_jiffies = 1291
>> <idle>-0 [001] d.h1 167.462663: account_process_tick: steal = 18732255, prev_steal_time = 1291000000
>> <idle>-0 [001] d.h1 167.462664: account_process_tick: steal_jiffies = 18446744072437
>>
>> The sudden decrease of "steal" causes steal_jiffies to underflow.
>> The root cause is kvm_steal_time being reset to 0 after hot-plugging
>> back in a CPU. Instead, the preexisting value can be used, which is
>> what the core scheduler code expects.
>>
>> John Stultz also reported a similar issue after guest S3.
>>
>> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Radim Krčmář <rkrcmar@redhat.com>
>> Cc: Ingo Molnar <mingo@kernel.org>
>> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
>> Cc: Rik van Riel <riel@redhat.com>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Frederic Weisbecker <fweisbec@gmail.com>
>> Cc: John Stultz <john.stultz@linaro.org>
>> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
>> ---
>> arch/x86/kernel/kvm.c | 2 --
>> 1 file changed, 2 deletions(-)
>>
>> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
>> index eea2a6f..1ef5e48 100644
>> --- a/arch/x86/kernel/kvm.c
>> +++ b/arch/x86/kernel/kvm.c
>> @@ -301,8 +301,6 @@ static void kvm_register_steal_time(void)
>> if (!has_steal_clock)
>> return;
>>
>> - memset(st, 0, sizeof(*st));
>> -
>> wrmsrl(MSR_KVM_STEAL_TIME, (slow_virt_to_phys(st) | KVM_MSR_ENABLED));
>> pr_info("kvm-stealtime: cpu %d, msr %llx\n",
>> cpu, (unsigned long long) slow_virt_to_phys(st));
>>
>
> Because there's no cover letter, I guess I have to ack each patch
> independently.
>
> Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Thanks to you and Rik for the review. There actually is a cover letter for
this version; it seems it was only sent to the mailing list, and I forgot
to Cc the maintainers/reviewers.
Regards,
Wanpeng Li
* [PATCH v6 2/3] sched/cputime: Fix prev steal time accouting during cpu hotplug
2016-06-13 10:32 [PATCH v6 0/3] Sched, KVM: st: Add steal time support to full dynticks CPU time accounting Wanpeng Li
2016-06-13 10:32 ` [PATCH v6 1/3] KVM: fix steal clock warp during guest cpu hotplug Wanpeng Li
@ 2016-06-13 10:32 ` Wanpeng Li
2016-06-13 10:44 ` Paolo Bonzini
2016-06-13 10:32 ` [PATCH v6 3/3] sched/cputime: Add steal time support to full dynticks CPU time accounting Wanpeng Li
2016-06-13 11:28 ` [PATCH v6 0/3] Sched, KVM: st: " Wanpeng Li
3 siblings, 1 reply; 10+ messages in thread
From: Wanpeng Li @ 2016-06-13 10:32 UTC
To: linux-kernel, kvm
Cc: Wanpeng Li, Ingo Molnar, Peter Zijlstra (Intel), Rik van Riel,
Thomas Gleixner, Frederic Weisbecker, Paolo Bonzini,
Radim Krčmář
From: Wanpeng Li <wanpeng.li@hotmail.com>
Commit e9532e69b8d1 ("sched/cputime: Fix steal time accounting vs. CPU
hotplug") sets rq->prev_* to 0 when a CPU comes back online after hotplug,
in order to fix the case where (after CPU hotplug) steal is smaller than
rq->prev_steal_time.
However, this should never happen. steal was only smaller because of the
KVM-specific bug fixed by the previous patch. Worse, with the previous patch
applied, the reset itself triggers a bug on CPU hot-unplug/plug: because
rq->prev_steal_time is cleared while the steal clock keeps its value, all of
the CPU's past steal time is accounted again on hot-plug.
Since the root cause has been fixed, we can just revert commit e9532e69b8d1.
Fixes: e9532e69b8d1 ("sched/cputime: Fix steal time accounting vs. CPU hotplug")
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
kernel/sched/core.c | 1 -
kernel/sched/sched.h | 13 -------------
2 files changed, 14 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7f2cae4..7d45bb3 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7213,7 +7213,6 @@ static void sched_rq_cpu_starting(unsigned int cpu)
 	struct rq *rq = cpu_rq(cpu);

 	rq->calc_load_update = calc_load_update;
-	account_reset_rq(rq);
 	update_max_interval();
 }

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 72f1f30..de607e4 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1809,16 +1809,3 @@ static inline void cpufreq_trigger_update(u64 time) {}
 #else /* arch_scale_freq_capacity */
 #define arch_scale_freq_invariant() (false)
 #endif
-
-static inline void account_reset_rq(struct rq *rq)
-{
-#ifdef CONFIG_IRQ_TIME_ACCOUNTING
-	rq->prev_irq_time = 0;
-#endif
-#ifdef CONFIG_PARAVIRT
-	rq->prev_steal_time = 0;
-#endif
-#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
-	rq->prev_steal_time_rq = 0;
-#endif
-}
--
1.9.1
* Re: [PATCH v6 2/3] sched/cputime: Fix prev steal time accouting during cpu hotplug
2016-06-13 10:32 ` [PATCH v6 2/3] sched/cputime: Fix prev steal time accouting during " Wanpeng Li
@ 2016-06-13 10:44 ` Paolo Bonzini
0 siblings, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2016-06-13 10:44 UTC
To: Wanpeng Li, linux-kernel, kvm
Cc: Wanpeng Li, Ingo Molnar, Peter Zijlstra (Intel), Rik van Riel,
Thomas Gleixner, Frederic Weisbecker, Radim Krčmář
On 13/06/2016 12:32, Wanpeng Li wrote:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
>
> Commit e9532e69b8d1 ("sched/cputime: Fix steal time accounting vs. CPU
> hotplug") set rq->prev_* to 0 after a cpu hotplug comes back in order to
> fix the case where (after CPU hotplug) steal is smaller than
> rq->prev_steal_time.
>
> However, this should never happen. steal was only smaller because of the
> KVM-specific bug fixed by the previous patch. Worse, the previous patch
> triggers a bug on CPU hot-unplug/plug operation: because
> rq->prev_steal_time is cleared, all of the CPU's past steal time will be
> accounted again on hot-plug.
>
> Since the root cause has been fixed, we can just revert commit e9532e69b8d1.
>
> Fixes: 'commit e9532e69b8d1 ("sched/cputime: Fix steal time accounting vs. CPU hotplug")'
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
> kernel/sched/core.c | 1 -
> kernel/sched/sched.h | 13 -------------
> 2 files changed, 14 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 7f2cae4..7d45bb3 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -7213,7 +7213,6 @@ static void sched_rq_cpu_starting(unsigned int cpu)
> struct rq *rq = cpu_rq(cpu);
>
> rq->calc_load_update = calc_load_update;
> - account_reset_rq(rq);
> update_max_interval();
> }
>
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 72f1f30..de607e4 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1809,16 +1809,3 @@ static inline void cpufreq_trigger_update(u64 time) {}
> #else /* arch_scale_freq_capacity */
> #define arch_scale_freq_invariant() (false)
> #endif
> -
> -static inline void account_reset_rq(struct rq *rq)
> -{
> -#ifdef CONFIG_IRQ_TIME_ACCOUNTING
> - rq->prev_irq_time = 0;
> -#endif
> -#ifdef CONFIG_PARAVIRT
> - rq->prev_steal_time = 0;
> -#endif
> -#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
> - rq->prev_steal_time_rq = 0;
> -#endif
> -}
>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
* [PATCH v6 3/3] sched/cputime: Add steal time support to full dynticks CPU time accounting
2016-06-13 10:32 [PATCH v6 0/3] Sched, KVM: st: Add steal time support to full dynticks CPU time accounting Wanpeng Li
2016-06-13 10:32 ` [PATCH v6 1/3] KVM: fix steal clock warp during guest cpu hotplug Wanpeng Li
2016-06-13 10:32 ` [PATCH v6 2/3] sched/cputime: Fix prev steal time accouting during " Wanpeng Li
@ 2016-06-13 10:32 ` Wanpeng Li
2016-06-13 10:44 ` Paolo Bonzini
2016-06-13 11:28 ` [PATCH v6 0/3] Sched, KVM: st: " Wanpeng Li
3 siblings, 1 reply; 10+ messages in thread
From: Wanpeng Li @ 2016-06-13 10:32 UTC
To: linux-kernel, kvm
Cc: Wanpeng Li, Ingo Molnar, Peter Zijlstra (Intel), Rik van Riel,
Thomas Gleixner, Frederic Weisbecker, Paolo Bonzini,
Radim Krčmář
From: Wanpeng Li <wanpeng.li@hotmail.com>
This patch adds guest steal-time support to full dynticks CPU
time accounting. After the following commit:
ff9a9b4c4334 ("sched, time: Switch VIRT_CPU_ACCOUNTING_GEN to jiffy granularity")
... time sampling became jiffy based, even though it is still triggered at
ring boundaries, so steal_account_process_tick() is reused to account how
many 'ticks' of steal time have accumulated since the last accounting.
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
kernel/sched/cputime.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 75f98c5..3d60e5d 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -257,7 +257,7 @@ void account_idle_time(cputime_t cputime)
 	cpustat[CPUTIME_IDLE] += (__force u64) cputime;
 }

-static __always_inline bool steal_account_process_tick(void)
+static __always_inline unsigned long steal_account_process_tick(unsigned long max_jiffies)
 {
 #ifdef CONFIG_PARAVIRT
 	if (static_key_false(&paravirt_steal_enabled)) {
@@ -272,14 +272,14 @@ static __always_inline bool steal_account_process_tick(void)
 		 * time in jiffies. Lets cast the result to jiffies
 		 * granularity and account the rest on the next rounds.
 		 */
-		steal_jiffies = nsecs_to_jiffies(steal);
+		steal_jiffies = min(nsecs_to_jiffies(steal), max_jiffies);
 		this_rq()->prev_steal_time += jiffies_to_nsecs(steal_jiffies);

 		account_steal_time(jiffies_to_cputime(steal_jiffies));
 		return steal_jiffies;
 	}
 #endif
-	return false;
+	return 0;
 }

 /*
@@ -346,7 +346,7 @@ static void irqtime_account_process_tick(struct task_struct *p, int user_tick,
 	u64 cputime = (__force u64) cputime_one_jiffy;
 	u64 *cpustat = kcpustat_this_cpu->cpustat;

-	if (steal_account_process_tick())
+	if (steal_account_process_tick(ULONG_MAX))
 		return;

 	cputime *= ticks;
@@ -477,7 +477,7 @@ void account_process_tick(struct task_struct *p, int user_tick)
 		return;
 	}

-	if (steal_account_process_tick())
+	if (steal_account_process_tick(ULONG_MAX))
 		return;

 	if (user_tick)
@@ -681,12 +681,14 @@ static cputime_t vtime_delta(struct task_struct *tsk)
 static cputime_t get_vtime_delta(struct task_struct *tsk)
 {
 	unsigned long now = READ_ONCE(jiffies);
-	unsigned long delta = now - tsk->vtime_snap;
+	unsigned long delta_jiffies, steal_jiffies;

+	delta_jiffies = now - tsk->vtime_snap;
+	steal_jiffies = steal_account_process_tick(delta_jiffies);
 	WARN_ON_ONCE(tsk->vtime_snap_whence == VTIME_INACTIVE);
 	tsk->vtime_snap = now;

-	return jiffies_to_cputime(delta);
+	return jiffies_to_cputime(delta_jiffies - steal_jiffies);
 }

 static void __vtime_account_system(struct task_struct *tsk)
--
1.9.1
* Re: [PATCH v6 3/3] sched/cputime: Add steal time support to full dynticks CPU time accounting
2016-06-13 10:32 ` [PATCH v6 3/3] sched/cputime: Add steal time support to full dynticks CPU time accounting Wanpeng Li
@ 2016-06-13 10:44 ` Paolo Bonzini
0 siblings, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2016-06-13 10:44 UTC
To: Wanpeng Li, linux-kernel, kvm
Cc: Wanpeng Li, Ingo Molnar, Peter Zijlstra (Intel), Rik van Riel,
Thomas Gleixner, Frederic Weisbecker, Radim Krčmář
On 13/06/2016 12:32, Wanpeng Li wrote:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
>
> This patch adds guest steal-time support to full dynticks CPU
> time accounting. After the following commit:
>
> ff9a9b4c4334 ("sched, time: Switch VIRT_CPU_ACCOUNTING_GEN to jiffy granularity")
>
> ... time sampling became jiffy based, even if it's still listening to ring
> boundaries, so steal_account_process_tick() is reused to account how many
> 'ticks' are stolen-time, after the last accumulation.
>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
> kernel/sched/cputime.c | 16 +++++++++-------
> 1 file changed, 9 insertions(+), 7 deletions(-)
>
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index 75f98c5..3d60e5d 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -257,7 +257,7 @@ void account_idle_time(cputime_t cputime)
> cpustat[CPUTIME_IDLE] += (__force u64) cputime;
> }
>
> -static __always_inline bool steal_account_process_tick(void)
> +static __always_inline unsigned long steal_account_process_tick(unsigned long max_jiffies)
> {
> #ifdef CONFIG_PARAVIRT
> if (static_key_false(&paravirt_steal_enabled)) {
> @@ -272,14 +272,14 @@ static __always_inline bool steal_account_process_tick(void)
> * time in jiffies. Lets cast the result to jiffies
> * granularity and account the rest on the next rounds.
> */
> - steal_jiffies = nsecs_to_jiffies(steal);
> + steal_jiffies = min(nsecs_to_jiffies(steal), max_jiffies);
> this_rq()->prev_steal_time += jiffies_to_nsecs(steal_jiffies);
>
> account_steal_time(jiffies_to_cputime(steal_jiffies));
> return steal_jiffies;
> }
> #endif
> - return false;
> + return 0;
> }
>
> /*
> @@ -346,7 +346,7 @@ static void irqtime_account_process_tick(struct task_struct *p, int user_tick,
> u64 cputime = (__force u64) cputime_one_jiffy;
> u64 *cpustat = kcpustat_this_cpu->cpustat;
>
> - if (steal_account_process_tick())
> + if (steal_account_process_tick(ULONG_MAX))
> return;
>
> cputime *= ticks;
> @@ -477,7 +477,7 @@ void account_process_tick(struct task_struct *p, int user_tick)
> return;
> }
>
> - if (steal_account_process_tick())
> + if (steal_account_process_tick(ULONG_MAX))
> return;
>
> if (user_tick)
> @@ -681,12 +681,14 @@ static cputime_t vtime_delta(struct task_struct *tsk)
> static cputime_t get_vtime_delta(struct task_struct *tsk)
> {
> unsigned long now = READ_ONCE(jiffies);
> - unsigned long delta = now - tsk->vtime_snap;
> + unsigned long delta_jiffies, steal_jiffies;
>
> + delta_jiffies = now - tsk->vtime_snap;
> + steal_jiffies = steal_account_process_tick(delta_jiffies);
> WARN_ON_ONCE(tsk->vtime_snap_whence == VTIME_INACTIVE);
> tsk->vtime_snap = now;
>
> - return jiffies_to_cputime(delta);
> + return jiffies_to_cputime(delta_jiffies - steal_jiffies);
> }
>
> static void __vtime_account_system(struct task_struct *tsk)
>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
* Re: [PATCH v6 0/3] Sched, KVM: st: Add steal time support to full dynticks CPU time accounting
2016-06-13 10:32 [PATCH v6 0/3] Sched, KVM: st: Add steal time support to full dynticks CPU time accounting Wanpeng Li
` (2 preceding siblings ...)
2016-06-13 10:32 ` [PATCH v6 3/3] sched/cputime: Add steal time support to full dynticks CPU time accounting Wanpeng Li
@ 2016-06-13 11:28 ` Wanpeng Li
3 siblings, 0 replies; 10+ messages in thread
From: Wanpeng Li @ 2016-06-13 11:28 UTC
To: linux-kernel@vger.kernel.org, kvm
Cc: Ingo Molnar, Peter Zijlstra (Intel), Rik van Riel,
Thomas Gleixner, Frederic Weisbecker, Paolo Bonzini,
Radim Krčmář, Wanpeng Li, John Stultz
Cc maintainers/reviewers,
2016-06-13 18:32 GMT+08:00 Wanpeng Li <kernellwp@gmail.com>:
> The periodic and NOHZ idle paths, which don't use vtime, already account
> steal time. However, vtime (which depends on context tracking and is only
> used in full dynticks mode) doesn't. This patchset adds steal time
> accounting to vtime, for use in full dynticks guests.
>
> Patches 1 and 2 fix the steal clock warp and the prev steal time
> accounting bugs during CPU hotplug.
> Patch 3 adds steal time support to full dynticks CPU time accounting.
>
> N.B. This version of the patchset drops the previous Acked-by and
> Reviewed-by tags, since the patches differ from the earlier versions. :)
>
> v5 -> v6:
> * improve commit message of patch 2/3, 3/3
> * fix steal time being accounted twice
> v4 -> v5:
> * improve commit message of patch 1/3
> * revert commit e9532e69b8d1
> * apply same logic to account_idle_time, so change get_vtime_delta instead
> v3 -> v4:
> * fix grammar errors, thanks Ingo
> * clean up fragile code, thanks Ingo
> v2 -> v3:
> * fix the root cause
> * convert steal time jiffies to cputime
> v1 -> v2:
> * update patch subject, description and comments
> * deal with the case where steal time suddenly increases by a ludicrous amount
> * fix divide-by-zero bug, thanks Rik
>
> Wanpeng Li (3):
> KVM: fix steal clock warp during guest cpu hotplug
> sched/cputime: Fix prev steal time accouting during cpu hotplug
> sched/cputime: Add steal time support to full dynticks CPU time
> accounting
>
> arch/x86/kernel/kvm.c | 2 --
> kernel/sched/core.c | 1 -
> kernel/sched/cputime.c | 16 +++++++++-------
> kernel/sched/sched.h | 13 -------------
> 4 files changed, 9 insertions(+), 23 deletions(-)
>
> --
> 1.9.1
>