* Xen: decreasing cpu steal clock counter
From: Valentin Vidic @ 2017-08-31 10:44 UTC
To: stable; +Cc: Michael Lass

The following behavior of the steal counter was observed
in a Xen guest running a 4.9 kernel:

  $ while sleep 1; do head -1 /proc/stat ; done
  cpu 1556 0 1429 314195002 5529 0 64 14370419283 0 0
  cpu 1556 0 1429 314195402 5529 0 64 3601506907 0 0
  cpu 1556 0 1429 314195802 5529 0 64 1833790429262 0 0
  cpu 1556 0 1429 314196203 5529 0 64 1821957766874 0 0
  cpu 1556 0 1429 314196603 5529 0 64 1810766851628 0 0
  cpu 1556 0 1429 314197002 5529 0 64 1792853828090 0 0

Could this patch or some variation of it be included in
the 4.9 LTS?

  https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=871608;filename=handle-decreasing-steal-clock.patch;msg=5

The problem exists in versions 4.8 until 4.11, so older LTS
kernels should not be affected.

More details on this problem here:

  https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest/

-- 
Valentin
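The symptom reported above can be checked mechanically. The following is a minimal sketch, not part of the original thread, that extracts the steal column (the eighth numeric field of the aggregate "cpu" line in /proc/stat) and flags a sample that is lower than the previous one. The function names parse_steal() and steal_decreased() are hypothetical and chosen only for illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Extract the steal field from an aggregate "cpu" line of /proc/stat.
 * Column order: user nice system idle iowait irq softirq steal ...
 * Returns 0 on success, -1 if the line is not an aggregate "cpu" line
 * or does not contain at least eight numeric fields.
 */
int parse_steal(const char *line, uint64_t *steal)
{
	uint64_t user, nice, sys, idle, iowait, irq, softirq;
	int n;

	/* Reject per-CPU lines such as "cpu0 ..." and anything else. */
	if (strncmp(line, "cpu ", 4) != 0)
		return -1;

	n = sscanf(line + 4,
		   "%" SCNu64 " %" SCNu64 " %" SCNu64 " %" SCNu64
		   " %" SCNu64 " %" SCNu64 " %" SCNu64 " %" SCNu64,
		   &user, &nice, &sys, &idle, &iowait, &irq, &softirq, steal);

	return n == 8 ? 0 : -1;
}

/* Return 1 if the steal counter went backwards between two samples. */
int steal_decreased(uint64_t prev, uint64_t cur)
{
	return cur < prev;
}
```

Fed the first two samples from the report (14370419283, then 3601506907), steal_decreased() returns 1, which is the anomaly being described: a cumulative counter should never go down.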
* Re: Xen: decreasing cpu steal clock counter
From: Greg KH @ 2017-08-31 13:37 UTC
To: Valentin Vidic; +Cc: stable, Michael Lass

On Thu, Aug 31, 2017 at 12:44:51PM +0200, Valentin Vidic wrote:
> The following behavior of the steal counter was observed
> in a Xen guest running a 4.9 kernel:
>
>   $ while sleep 1; do head -1 /proc/stat ; done
>   cpu 1556 0 1429 314195002 5529 0 64 14370419283 0 0
>   cpu 1556 0 1429 314195402 5529 0 64 3601506907 0 0
>   cpu 1556 0 1429 314195802 5529 0 64 1833790429262 0 0
>   cpu 1556 0 1429 314196203 5529 0 64 1821957766874 0 0
>   cpu 1556 0 1429 314196603 5529 0 64 1810766851628 0 0
>   cpu 1556 0 1429 314197002 5529 0 64 1792853828090 0 0
>
> Could this patch or some variation of it be included in
> the 4.9 LTS?

What patch?

>   https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=871608;filename=handle-decreasing-steal-clock.patch;msg=5
>
> The problem exists in versions 4.8 until 4.11, so older LTS
> kernels should not be affected.
>
> More details on this problem here:
>
>   https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest/

What is the git commit id of the patch in Linus's tree that resolves
this issue?

thanks,

greg k-h
* Re: Xen: decreasing cpu steal clock counter
From: Valentin Vidic @ 2017-08-31 13:51 UTC
To: Greg KH; +Cc: stable, Michael Lass

[-- Attachment #1: Type: text/plain, Size: 544 bytes --]

On Thu, Aug 31, 2017 at 03:37:09PM +0200, Greg KH wrote:
> What patch?

Attaching the patch from this link:

  https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=871608;filename=handle-decreasing-steal-clock.patch;msg=5

> What is the git commit id of the aptch in Linus's tree that resolves
> this issue?

The issue was fixed later in 4.11, but this might be a bigger change
so not sure if you want to take that:

  2b1f967d80e8e5d7361f0e1654c842869570f573
  sched/cputime: Complete nsec conversion of tick based accounting

-- 
Valentin

[-- Attachment #2: handle-decreasing-steal-clock.patch --]
[-- Type: text/x-diff, Size: 2959 bytes --]

From 4b66621a06a94d22629661a9262f92b8cf5b7ca9 Mon Sep 17 00:00:00 2001
From: Michael Lass <bevan@bi-co.net>
Date: Sun, 6 Aug 2017 18:09:21 +0200
Subject: [PATCH] sched/cputime: handle decreasing steal clock

On some flaky Xen hosts, the steal clock returned by paravirt_steal_clock
is not monotonically increasing but can slightly decrease. Currently this
results in an overflow of u64 steal. Before giving this number to
account_steal_time() it is converted into cputime, so the target cpustat
counter cpustat[CPUTIME_STEAL] is not overflowing as well but instead
increased by a large amount. Due to the conversion to cputime and back
into nanoseconds, this_rq()->prev_steal_time does not correctly reflect
the latest reported steal clock afterwards, resulting in erratic behavior
such as backwards running cpustat[CPUTIME_STEAL].

The following is a trace from userspace of the value for steal time
reported in /proc/stat:

  time    stolen         diff
  ----    ------         ----
  0ms     784
  100ms   1844670130367  1844670129583
  200ms   1844664564089  -5566278
  300ms   1844659554439  -5009650
  400ms   1844655101417  -4453022

This issue was probably introduced by the following commits, which
deactivate a check for (steal < 0) in the Xen pv guest codepath and allow
unlimited jumps of the cpustat counters (both introduced in v4.8):

  ecb23dc6f2eff0ce64dd60351a81f376f13b12cc
  03cbc732639ddcad15218c4b2046d255851ff1e3

As a workaround, ignore decreasing values of the steal clock. By not
updating this_rq()->prev_steal_time we make sure that steal time is only
accounted as soon as the steal clock rises above the value that was
already observed and accounted for earlier.

In current kernel versions (v4.11 and higher) this issue should not exist
since conversion between nsec and cputime has been eliminated. Therefore
all values will overflow, i.e. decrease as reported by the host system.
---
 kernel/sched/cputime.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 5ebee3164e64..5f039f7f9294 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -262,10 +262,19 @@ static __always_inline cputime_t steal_account_process_time(cputime_t maxtime)
 #ifdef CONFIG_PARAVIRT
 	if (static_key_false(&paravirt_steal_enabled)) {
 		cputime_t steal_cputime;
-		u64 steal;
-
-		steal = paravirt_steal_clock(smp_processor_id());
-		steal -= this_rq()->prev_steal_time;
+		u64 steal_time;
+		s64 steal;
+
+		steal_time = paravirt_steal_clock(smp_processor_id());
+		steal = steal_time - this_rq()->prev_steal_time;
+
+		if (unlikely(steal < 0)) {
+			printk_ratelimited(KERN_DEBUG "cputime: steal_clock for "
+				"processor %d decreased: %llu -> %llu, "
+				"ignoring\n", smp_processor_id(),
+				this_rq()->prev_steal_time, steal_time);
+			return 0;
+		}
 
 		steal_cputime = min(nsecs_to_cputime(steal), maxtime);
 		account_steal_time(steal_cputime);
-- 
2.14.0
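The arithmetic behind the patch can be reproduced in plain userspace C. The sketch below is an illustration written for this thread, not kernel code: steal_delta_unsigned() mimics the pre-patch u64 subtraction, and steal_to_account() mimics the patched signed check. Both function names are hypothetical. As a simplification, the sketch advances the stored previous value straight to the current reading and leaves out the cputime conversion and the maxtime clamp seen in the diff.

```c
#include <stdint.h>

/*
 * Pre-patch behavior: an unsigned delta. If cur is slightly smaller
 * than prev, the subtraction wraps around to a value close to 2^64.
 */
uint64_t steal_delta_unsigned(uint64_t cur, uint64_t prev)
{
	return cur - prev;
}

/*
 * Patched behavior (simplified): interpret the delta as signed and
 * ignore negative values. *prev is left untouched in that case, so
 * nothing is accounted until the clock rises above the old value.
 * Assumes the usual two's complement conversion from uint64_t to
 * int64_t, which is what common compilers and platforms provide.
 */
uint64_t steal_to_account(uint64_t cur, uint64_t *prev)
{
	int64_t delta = (int64_t)(cur - *prev);

	if (delta < 0)
		return 0;	/* steal clock went backwards: ignore */

	*prev = cur;		/* simplification, see the note above */
	return (uint64_t)delta;
}
```

With prev = 100 and cur = 90, steal_delta_unsigned() yields UINT64_MAX - 9, that is 2^64 - 10, whereas steal_to_account() returns 0 and keeps prev at 100. The jump to roughly 1844670130367 in the trace is consistent with such a wrapped nanosecond value being reported in 10 ms ticks (2^64 ns is about 1844674407370 ticks), assuming USER_HZ is 100.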
* Re: Xen: decreasing cpu steal clock counter
From: Greg KH @ 2017-08-31 14:08 UTC
To: Valentin Vidic; +Cc: stable, Michael Lass

On Thu, Aug 31, 2017 at 03:51:39PM +0200, Valentin Vidic wrote:
> On Thu, Aug 31, 2017 at 03:37:09PM +0200, Greg KH wrote:
> > What patch?
>
> Attaching the patch from this link:
>
>   https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=871608;filename=handle-decreasing-steal-clock.patch;msg=5

I can't do anything with non-upstream patches for stable kernels.

You have read
	https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
right?

> > What is the git commit id of the patch in Linus's tree that resolves
> > this issue?
>
> The issue was fixed later in 4.11, but this might be a bigger change
> so not sure if you want to take that:
>
>   2b1f967d80e8e5d7361f0e1654c842869570f573
>   sched/cputime: Complete nsec conversion of tick based accounting

I always would rather take the original change that is in Linus's tree,
as 99% of the time we take something different, it ends up being wrong.

But I kind of doubt the above git commit id is the right one to take :(

I need some feedback from the Xen maintainers before I can do anything
else.

thanks,

greg k-h
* Re: Xen: decreasing cpu steal clock counter
From: Valentin Vidic @ 2017-08-31 15:05 UTC
To: Greg KH; +Cc: stable, Michael Lass

On Thu, Aug 31, 2017 at 04:08:51PM +0200, Greg KH wrote:
> I always would rather take the original change that is in Linus's tree,
> as 99% of the time we take something different, it ends up being wrong.
>
> But I kind of doubt the above git commit id is the right one to take :(
>
> I need some feedback from the Xen maintainers before I can do anything
> else.

No problem, thanks for the reply. I will try to extract and test the
changes for this problem from the later kernel version and also check
with the Xen people.

-- 
Valentin