* [patch] sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression
From: Mike Galbraith @ 2016-08-10 11:14 UTC
To: Peter Zijlstra; +Cc: LKML

Hi Peter,

While running ltp, the fates decided it was time for me to encounter
the roughly 1 out of every 10 call failure below. As much as I run
ltp, I'm a bit surprised that I (or anyone else) haven't met this
before, but then the fates are known to be a tad fickle.

getrusage04 0 TINFO : Expected timers granularity is 4000 us
getrusage04 0 TINFO : Using 1 as multiply factor for max [us]time increment (1000+4000us)!
getrusage04 0 TINFO : utime: 0us; stime: 179us
getrusage04 0 TINFO : utime: 3751us; stime: 0us
getrusage04 1 TFAIL : getrusage04.c:133: stime increased > 5000us:

When applying the full rtime to either stime or utime, do not overwrite
the previously tallied value.

Fixes: 9d7fb0427648 ("sched/cputime: Guarantee stime + utime == rtime")
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: stable@vger.kernel.org # 4.3+
---
 kernel/sched/cputime.c | 2 ++
 1 file changed, 2 insertions(+)

--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -608,11 +608,13 @@ static void cputime_adjust(struct task_c
 
 	if (utime == 0) {
 		stime = rtime;
+		utime = prev->utime;
 		goto update;
 	}
 
 	if (stime == 0) {
 		utime = rtime;
+		stime = prev->stime;
 		goto update;
 	}
 
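
The failure above can be reproduced on paper with a small user-space model of the unpatched shortcut paths in cputime_adjust(). This is only a sketch under stated assumptions: the names `cputime_sample` and `adjust_prepatch` are hypothetical, values are plain microseconds, locking is omitted, and scale_stime() is replaced by a naive 64-bit multiply and divide. The tick samples used in the note after the block (no ticks at the first call, one 4000us user tick at the second) are assumed, chosen to match the getrusage04 output.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's tick samples and prev_cputime. */
struct cputime_sample {
	uint64_t utime;
	uint64_t stime;
};

/*
 * Simplified model of the unpatched flow, not the kernel code itself.
 * rtime is the precise runtime, curr holds the tick-based samples and
 * prev holds the values last reported to user space.
 */
static void adjust_prepatch(uint64_t rtime, struct cputime_sample curr,
			    struct cputime_sample *prev)
{
	uint64_t stime = curr.stime;
	uint64_t utime = curr.utime;

	/* Nothing new to report if rtime has not moved past prev. */
	if (prev->stime + prev->utime >= rtime)
		return;

	if (utime == 0) {
		stime = rtime;
		goto update;	/* jumps past the clamps below */
	}

	if (stime == 0) {
		utime = rtime;
		goto update;	/* jumps past the clamps below */
	}

	/* Naive scaling; assumes the product fits in 64 bits. */
	stime = stime * rtime / (stime + utime);

	/* Monotonicity clamps, reached only by the scaled path. */
	if (stime < prev->stime)
		stime = prev->stime;
	utime = rtime - stime;

	if (utime < prev->utime) {
		utime = prev->utime;
		stime = rtime - utime;
	}

update:
	/* The invariant named in the Fixes: commit still holds here. */
	assert(stime + utime == rtime);
	prev->stime = stime;
	prev->utime = utime;
}
```

Starting from a zeroed prev, calling the model with rtime=179 and no ticks, then with rtime=3751 and a single user tick, leaves prev at utime=0/stime=179 and then at utime=3751/stime=0. The sum stays equal to rtime, but stime moves backwards because both shortcuts bypass the clamps.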

* Re: [patch] sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression
From: Peter Zijlstra @ 2016-08-10 12:30 UTC
To: Mike Galbraith; +Cc: LKML

On Wed, Aug 10, 2016 at 01:14:29PM +0200, Mike Galbraith wrote:
> Hi Peter,
>
> While running ltp, the fates decided it was time for me to encounter
> the roughly 1 out of every 10 call failure below. As much as I run
> ltp, I'm a bit surprised that I (or anyone else) haven't met this
> before, but then the fates are known to be a tad fickle.
>
> getrusage04 0 TINFO : Expected timers granularity is 4000 us
> getrusage04 0 TINFO : Using 1 as multiply factor for max [us]time increment (1000+4000us)!
> getrusage04 0 TINFO : utime: 0us; stime: 179us
> getrusage04 0 TINFO : utime: 3751us; stime: 0us
> getrusage04 1 TFAIL : getrusage04.c:133: stime increased > 5000us:
>
> When applying the full rtime to either stime or utime, do not overwrite
> the previously tallied value.
>
> Fixes: 9d7fb0427648 ("sched/cputime: Guarantee stime + utime == rtime")
> Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
> Cc: stable@vger.kernel.org # 4.3+
> ---
>  kernel/sched/cputime.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -608,11 +608,13 @@ static void cputime_adjust(struct task_c
>
>  	if (utime == 0) {
>  		stime = rtime;
> +		utime = prev->utime;
>  		goto update;
>  	}
>
>  	if (stime == 0) {
>  		utime = rtime;
> +		stime = prev->stime;
>  		goto update;
>  	}

This cannot be right; it violates that utime+stime==rtime. Let me try
and figure out what actually happens.

* Re: [patch] sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression
From: Peter Zijlstra @ 2016-08-10 12:47 UTC
To: Mike Galbraith; +Cc: LKML

On Wed, Aug 10, 2016 at 02:30:33PM +0200, Peter Zijlstra wrote:
> On Wed, Aug 10, 2016 at 01:14:29PM +0200, Mike Galbraith wrote:
> > Hi Peter,
> >
> > While running ltp, the fates decided it was time for me to encounter
> > the roughly 1 out of every 10 call failure below. As much as I run
> > ltp, I'm a bit surprised that I (or anyone else) haven't met this
> > before, but then the fates are known to be a tad fickle.
> >
> > getrusage04 0 TINFO : Expected timers granularity is 4000 us
> > getrusage04 0 TINFO : Using 1 as multiply factor for max [us]time increment (1000+4000us)!
> > getrusage04 0 TINFO : utime: 0us; stime: 179us
> > getrusage04 0 TINFO : utime: 3751us; stime: 0us
> > getrusage04 1 TFAIL : getrusage04.c:133: stime increased > 5000us:
> >
> > When applying the full rtime to either stime or utime, do not overwrite
> > the previously tallied value.
> >
> > Fixes: 9d7fb0427648 ("sched/cputime: Guarantee stime + utime == rtime")
> > Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
> > Cc: stable@vger.kernel.org # 4.3+
> > ---
> >  kernel/sched/cputime.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > --- a/kernel/sched/cputime.c
> > +++ b/kernel/sched/cputime.c
> > @@ -608,11 +608,13 @@ static void cputime_adjust(struct task_c
> >
> >  	if (utime == 0) {
> >  		stime = rtime;
> > +		utime = prev->utime;
> >  		goto update;
> >  	}
> >
> >  	if (stime == 0) {
> >  		utime = rtime;
> > +		stime = prev->stime;
> >  		goto update;
> >  	}
>
> This cannot be right; it violates that utime+stime==rtime. Let me try
> and figure out what actually happens.

Any idea where your [us]time are coming from? Do you end up in the
vtime_accounting_enabled() path or not?

* Re: [patch] sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression
From: Mike Galbraith @ 2016-08-10 14:21 UTC
To: Peter Zijlstra; +Cc: LKML

On Wed, 2016-08-10 at 14:47 +0200, Peter Zijlstra wrote:

> Any idea where your [us]time are coming from? Do you end up in the
> vtime_accounting_enabled() path or not?

No, I'm not booting with nohz_full=.

	-Mike

* Re: [patch] sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression
From: Mike Galbraith @ 2016-08-10 18:57 UTC
To: Peter Zijlstra; +Cc: LKML

On Wed, 2016-08-10 at 14:30 +0200, Peter Zijlstra wrote:
> On Wed, Aug 10, 2016 at 01:14:29PM +0200, Mike Galbraith wrote:

> > --- a/kernel/sched/cputime.c
> > +++ b/kernel/sched/cputime.c
> > @@ -608,11 +608,13 @@ static void cputime_adjust(struct task_c
> >
> >  	if (utime == 0) {
> >  		stime = rtime;
> > +		utime = prev->utime;
> >  		goto update;
> >  	}
> >
> >  	if (stime == 0) {
> >  		utime = rtime;
> > +		stime = prev->stime;
> >  		goto update;
> >  	}
>
> This cannot be right; it violates that utime+stime==rtime. Let me try
> and figure out what actually happens.

How about this instead.

sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression

Roughly 10% of the time, ltp testcase getrusage04 fails:
getrusage04 0 TINFO : Expected timers granularity is 4000 us
getrusage04 0 TINFO : Using 1 as multiply factor for max [us]time increment (1000+4000us)!
getrusage04 0 TINFO : utime: 0us; stime: 179us
getrusage04 0 TINFO : utime: 3751us; stime: 0us
getrusage04 1 TFAIL : getrusage04.c:133: stime increased > 5000us:

If ->sum_exec_runtime has moved beyond the rtime of ->prev_cputime, but
no time has as yet been accounted to the task, bail.

Fixes: 9d7fb0427648 ("sched/cputime: Guarantee stime + utime == rtime")
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: stable@vger.kernel.org # 4.3+
---
 kernel/sched/cputime.c | 7 +++++++
 1 file changed, 7 insertions(+)

--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -606,6 +606,13 @@ static void cputime_adjust(struct task_c
 	stime = curr->stime;
 	utime = curr->utime;
 
+	/*
+	 * sum_exec_runtime has moved, but nothing has yet been
+	 * accounted to the task, there's nothing to update.
+	 */
+	if (utime + stime == 0)
+		goto out;
+
 	if (utime == 0) {
 		stime = rtime;
 		goto update;

* Re: [patch] sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression
From: Peter Zijlstra @ 2016-08-15 8:51 UTC
To: Mike Galbraith; +Cc: LKML

On Wed, Aug 10, 2016 at 08:57:28PM +0200, Mike Galbraith wrote:
> sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression
>
> Roughly 10% of the time, ltp testcase getrusage04 fails:
> getrusage04 0 TINFO : Expected timers granularity is 4000 us
> getrusage04 0 TINFO : Using 1 as multiply factor for max [us]time increment (1000+4000us)!
> getrusage04 0 TINFO : utime: 0us; stime: 179us
> getrusage04 0 TINFO : utime: 3751us; stime: 0us
> getrusage04 1 TFAIL : getrusage04.c:133: stime increased > 5000us:
>
> If ->sum_exec_runtime has moved beyond the rtime of ->prev_cputime, but
> no time has as yet been accounted to the task, bail.
>
> Fixes: 9d7fb0427648 ("sched/cputime: Guarantee stime + utime == rtime")
> Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
> Cc: stable@vger.kernel.org # 4.3+
> ---
>  kernel/sched/cputime.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -606,6 +606,13 @@ static void cputime_adjust(struct task_c
>  	stime = curr->stime;
>  	utime = curr->utime;
>
> +	/*
> +	 * sum_exec_runtime has moved, but nothing has yet been
> +	 * accounted to the task, there's nothing to update.
> +	 */
> +	if (utime + stime == 0)
> +		goto out;

urgh...

Valid scenario.. not sure about the solution though. This would mean the
task has _no_ running time if it forever dodges the tick, which would be
bad.

Does something like so cure things too?

---
 kernel/sched/cputime.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 9858266fb0b3..2ee83b200504 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -614,19 +614,25 @@ static void cputime_adjust(struct task_cputime *curr,
 	stime = curr->stime;
 	utime = curr->utime;
 
-	if (utime == 0) {
-		stime = rtime;
+	/*
+	 * If either stime or both stime and utime are 0, assume all runtime is
+	 * userspace. Once a task gets some ticks, the monotonicity code at
+	 * 'update' will ensure things converge to the observed ratio.
+	 */
+	if (stime == 0) {
+		utime = rtime;
 		goto update;
 	}
 
-	if (stime == 0) {
-		utime = rtime;
+	if (utime == 0) {
+		stime = rtime;
 		goto update;
 	}
 
 	stime = scale_stime((__force u64)stime, (__force u64)rtime,
 			    (__force u64)(stime + utime));
 
+update:
 	/*
 	 * Make sure stime doesn't go backwards; this preserves monotonicity
 	 * for utime because rtime is monotonic.
@@ -649,7 +655,6 @@ static void cputime_adjust(struct task_cputime *curr,
 		stime = rtime - utime;
 	}
 
-update:
 	prev->stime = stime;
 	prev->utime = utime;
 out:
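
The effect of moving the 'update' label ahead of the monotonicity checks can be illustrated with a small user-space model. It is a sketch only, not the kernel function: the names `cputime_pair` and `adjust_reordered` are hypothetical, values are plain microseconds, locking is omitted, and scale_stime() is approximated by a naive 64-bit multiply and divide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's tick samples and prev_cputime. */
struct cputime_pair {
	uint64_t utime;
	uint64_t stime;
};

/*
 * Simplified model of the reordered flow: the shortcut paths now land
 * on 'update' before the clamps, so every path is clamped against prev.
 */
static void adjust_reordered(uint64_t rtime, struct cputime_pair curr,
			     struct cputime_pair *prev)
{
	uint64_t stime = curr.stime;
	uint64_t utime = curr.utime;

	/* Nothing new to report if rtime has not moved past prev. */
	if (prev->stime + prev->utime >= rtime)
		return;

	/* No system ticks (or no ticks at all): assume user time. */
	if (stime == 0) {
		utime = rtime;
		goto update;
	}

	if (utime == 0) {
		stime = rtime;
		goto update;
	}

	/* Naive scaling; assumes the product fits in 64 bits. */
	stime = stime * rtime / (stime + utime);

update:
	/* stime must not go backwards; utime takes the remainder. */
	if (stime < prev->stime)
		stime = prev->stime;
	utime = rtime - stime;

	/* utime must not go backwards either. */
	if (utime < prev->utime) {
		utime = prev->utime;
		stime = rtime - utime;
	}

	assert(stime + utime == rtime);
	prev->stime = stime;
	prev->utime = utime;
}
```

With a zeroed prev, a call with rtime=179 and no ticks reports utime=179/stime=0, so a task that never sees a tick still accumulates runtime instead of reporting none. If prev already holds utime=0/stime=179 and the next call sees only a user tick with rtime=3751, the clamp keeps stime at 179 and assigns the remaining 3572 to utime, so neither value moves backwards and the sum still equals rtime.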

* Re: [patch] sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression
From: Mike Galbraith @ 2016-08-15 12:29 UTC
To: Peter Zijlstra; +Cc: LKML

On Mon, 2016-08-15 at 10:51 +0200, Peter Zijlstra wrote:
> On Wed, Aug 10, 2016 at 08:57:28PM +0200, Mike Galbraith wrote:
>
> > +	/*
> > +	 * sum_exec_runtime has moved, but nothing has yet been
> > +	 * accounted to the task, there's nothing to update.
> > +	 */
> > +	if (utime + stime == 0)
> > +		goto out;
>
> urgh...
>
> Valid scenario.. not sure about the solution though. This would mean the
> task has _no_ running time if it forever dodges the tick, which would be
> bad.
>
> Does something like so cure things too?

Yeah, it's a happy camper.

> ---
>  kernel/sched/cputime.c | 15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index 9858266fb0b3..2ee83b200504 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -614,19 +614,25 @@ static void cputime_adjust(struct task_cputime *curr,
>  	stime = curr->stime;
>  	utime = curr->utime;
>
> -	if (utime == 0) {
> -		stime = rtime;
> +	/*
> +	 * If either stime or both stime and utime are 0, assume all runtime is
> +	 * userspace. Once a task gets some ticks, the monotonicity code at
> +	 * 'update' will ensure things converge to the observed ratio.
> +	 */
> +	if (stime == 0) {
> +		utime = rtime;
>  		goto update;
>  	}
>
> -	if (stime == 0) {
> -		utime = rtime;
> +	if (utime == 0) {
> +		stime = rtime;
>  		goto update;
>  	}
>
>  	stime = scale_stime((__force u64)stime, (__force u64)rtime,
>  			    (__force u64)(stime + utime));
>
> +update:
>  	/*
>  	 * Make sure stime doesn't go backwards; this preserves monotonicity
>  	 * for utime because rtime is monotonic.
> @@ -649,7 +655,6 @@ static void cputime_adjust(struct task_cputime *curr,
>  		stime = rtime - utime;
>  	}
>
> -update:
>  	prev->stime = stime;
>  	prev->utime = utime;
>  out: