* [patch] sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression
@ 2016-08-10 11:14 Mike Galbraith
2016-08-10 12:30 ` Peter Zijlstra
0 siblings, 1 reply; 7+ messages in thread
From: Mike Galbraith @ 2016-08-10 11:14 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: LKML
Hi Peter,
While running ltp, the fates decided it was time for me to encounter
the roughly 1 out of every 10 call failure below. As much as I run
ltp, I'm a bit surprised that I (or anyone else) haven't met this
before, but then the fates are known to be a tad fickle.
getrusage04 0 TINFO : Expected timers granularity is 4000 us
getrusage04 0 TINFO : Using 1 as multiply factor for max [us]time increment (1000+4000us)!
getrusage04 0 TINFO : utime: 0us; stime: 179us
getrusage04 0 TINFO : utime: 3751us; stime: 0us
getrusage04 1 TFAIL : getrusage04.c:133: stime increased > 5000us:
When applying the full rtime to either stime or utime, do not overwrite
the previously tallied value.
Fixes: 9d7fb0427648 ("sched/cputime: Guarantee stime + utime == rtime")
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: stable@vger.kernel.org # 4.3+
---
kernel/sched/cputime.c | 2 ++
1 file changed, 2 insertions(+)
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -608,11 +608,13 @@ static void cputime_adjust(struct task_c

 	if (utime == 0) {
 		stime = rtime;
+		utime = prev->utime;
 		goto update;
 	}

 	if (stime == 0) {
 		utime = rtime;
+		stime = prev->stime;
 		goto update;
 	}
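
A minimal userspace sketch of how the log above can arise, assuming the pre-patch special cases shown as context in the hunk. `struct cputime`, `adjust_old()` and `replay_old()` are hypothetical names used only for illustration, the tick-sampled value 4000 is an assumed input, and the kernel's scaling path is left out. Only 179 and 3751 come from the log.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's cputime bookkeeping (assumption). */
struct cputime {
	uint64_t utime;
	uint64_t stime;
};

/*
 * Model of the two special cases in cputime_adjust() before any patch.
 * curr holds tick-sampled values, rtime is the precise runtime and prev
 * is what was last reported. Scaling for the general case is omitted.
 */
void adjust_old(const struct cputime *curr, uint64_t rtime,
		struct cputime *prev)
{
	uint64_t utime = curr->utime;
	uint64_t stime = curr->stime;

	assert(curr && prev);

	if (utime == 0)
		stime = rtime;	/* no user ticks: all runtime is system */
	else if (stime == 0)
		utime = rtime;	/* no system ticks: all runtime is user */

	/* 'update:' stores the result without a monotonicity check */
	prev->utime = utime;
	prev->stime = stime;
}

/* Replays two samples and records the reported stime after each one. */
void replay_old(uint64_t out_stime[2])
{
	struct cputime prev = { 0, 0 };
	struct cputime first = { 0, 0 };	/* no tick has hit the task yet */
	struct cputime second = { 4000, 0 };	/* assumed: one tick in user mode */

	adjust_old(&first, 179, &prev);
	out_stime[0] = prev.stime;
	adjust_old(&second, 3751, &prev);
	out_stime[1] = prev.stime;
}
```

In this model the reported stime goes from 179 back to 0 between the two samples, which matches the two TINFO lines and is the kind of backwards step the test reports as a failure.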
* Re: [patch] sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression
2016-08-10 11:14 [patch] sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression Mike Galbraith
@ 2016-08-10 12:30 ` Peter Zijlstra
2016-08-10 12:47 ` Peter Zijlstra
2016-08-10 18:57 ` Mike Galbraith
0 siblings, 2 replies; 7+ messages in thread
From: Peter Zijlstra @ 2016-08-10 12:30 UTC (permalink / raw)
To: Mike Galbraith; +Cc: LKML
On Wed, Aug 10, 2016 at 01:14:29PM +0200, Mike Galbraith wrote:
> Hi Peter,
>
> While running ltp, the fates decided it was time for me to encounter
> the roughly 1 out of every 10 call failure below. As much as I run
> ltp, I'm a bit surprised that I (or anyone else) haven't met this
> before, but then the fates are known to be a tad fickle.
>
> getrusage04 0 TINFO : Expected timers granularity is 4000 us
> getrusage04 0 TINFO : Using 1 as multiply factor for max [us]time increment (1000+4000us)!
> getrusage04 0 TINFO : utime: 0us; stime: 179us
> getrusage04 0 TINFO : utime: 3751us; stime: 0us
> getrusage04 1 TFAIL : getrusage04.c:133: stime increased > 5000us:
>
> When applying the full rtime to either stime or utime, do not overwrite
> the previously tallied value.
>
> Fixes: 9d7fb0427648 ("sched/cputime: Guarantee stime + utime == rtime")
> Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
> Cc: stable@vger.kernel.org # 4.3+
> ---
> kernel/sched/cputime.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -608,11 +608,13 @@ static void cputime_adjust(struct task_c
>
>  	if (utime == 0) {
>  		stime = rtime;
> +		utime = prev->utime;
>  		goto update;
>  	}
>
>  	if (stime == 0) {
>  		utime = rtime;
> +		stime = prev->stime;
>  		goto update;
>  	}
This cannot be right; it violates that utime+stime==rtime. Let me try
and figure out what actually happens.
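
A minimal userspace sketch of the objection, assuming the two added assignments from the quoted hunk and nothing else. `struct cputime`, `adjust_v1()` and `v1_overshoot()` are hypothetical names, and the tick-sampled value 4000 is an assumed input. The rtime values 179 and 3751 are taken from the log.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's cputime bookkeeping (assumption). */
struct cputime {
	uint64_t utime;
	uint64_t stime;
};

/* Model of the special cases with the first patch applied. */
void adjust_v1(const struct cputime *curr, uint64_t rtime,
	       struct cputime *prev)
{
	uint64_t utime = curr->utime;
	uint64_t stime = curr->stime;

	assert(curr && prev);

	if (utime == 0) {
		stime = rtime;
		utime = prev->utime;	/* line added by the patch */
	} else if (stime == 0) {
		utime = rtime;
		stime = prev->stime;	/* line added by the patch */
	}

	prev->utime = utime;
	prev->stime = stime;
}

/* Returns how far utime + stime exceeds rtime after the second sample. */
uint64_t v1_overshoot(void)
{
	struct cputime prev = { 0, 0 };
	struct cputime first = { 0, 0 };	/* no tick has hit the task yet */
	struct cputime second = { 4000, 0 };	/* assumed: one tick in user mode */

	adjust_v1(&first, 179, &prev);		/* prev = { 0, 179 } */
	adjust_v1(&second, 3751, &prev);	/* prev = { 3751, 179 } */
	return prev.utime + prev.stime - 3751;
}
```

With these inputs the reported sum is 3930 against an rtime of 3751, so the stale 179 is counted twice.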
* Re: [patch] sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression
2016-08-10 12:30 ` Peter Zijlstra
@ 2016-08-10 12:47 ` Peter Zijlstra
2016-08-10 14:21 ` Mike Galbraith
2016-08-10 18:57 ` Mike Galbraith
1 sibling, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2016-08-10 12:47 UTC (permalink / raw)
To: Mike Galbraith; +Cc: LKML
On Wed, Aug 10, 2016 at 02:30:33PM +0200, Peter Zijlstra wrote:
> On Wed, Aug 10, 2016 at 01:14:29PM +0200, Mike Galbraith wrote:
> > Hi Peter,
> >
> > While running ltp, the fates decided it was time for me to encounter
> > the roughly 1 out of every 10 call failure below. As much as I run
> > ltp, I'm a bit surprised that I (or anyone else) haven't met this
> > before, but then the fates are known to be a tad fickle.
> >
> > getrusage04 0 TINFO : Expected timers granularity is 4000 us
> > getrusage04 0 TINFO : Using 1 as multiply factor for max [us]time increment (1000+4000us)!
> > getrusage04 0 TINFO : utime: 0us; stime: 179us
> > getrusage04 0 TINFO : utime: 3751us; stime: 0us
> > getrusage04 1 TFAIL : getrusage04.c:133: stime increased > 5000us:
> >
> > When applying the full rtime to either stime or utime, do not overwrite
> > the previously tallied value.
> >
> > Fixes: 9d7fb0427648 ("sched/cputime: Guarantee stime + utime == rtime")
> > Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
> > Cc: stable@vger.kernel.org # 4.3+
> > ---
> > kernel/sched/cputime.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > --- a/kernel/sched/cputime.c
> > +++ b/kernel/sched/cputime.c
> > @@ -608,11 +608,13 @@ static void cputime_adjust(struct task_c
> >
> >  	if (utime == 0) {
> >  		stime = rtime;
> > +		utime = prev->utime;
> >  		goto update;
> >  	}
> >
> >  	if (stime == 0) {
> >  		utime = rtime;
> > +		stime = prev->stime;
> >  		goto update;
> >  	}
>
> This cannot be right; it violates that utime+stime==rtime. Let me try
> and figure out what actually happens.
Any idea where your [us]time are coming from? Do you end up in the
vtime_accounting_enabled() path or not?
* Re: [patch] sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression
2016-08-10 12:47 ` Peter Zijlstra
@ 2016-08-10 14:21 ` Mike Galbraith
0 siblings, 0 replies; 7+ messages in thread
From: Mike Galbraith @ 2016-08-10 14:21 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: LKML
On Wed, 2016-08-10 at 14:47 +0200, Peter Zijlstra wrote:
> Any idea where your [us]time are coming from? Do you end up in the
> vtime_accounting_enabled() path or not?
No, I'm not booting with nohz_full=.
-Mike
* Re: [patch] sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression
2016-08-10 12:30 ` Peter Zijlstra
2016-08-10 12:47 ` Peter Zijlstra
@ 2016-08-10 18:57 ` Mike Galbraith
2016-08-15 8:51 ` Peter Zijlstra
1 sibling, 1 reply; 7+ messages in thread
From: Mike Galbraith @ 2016-08-10 18:57 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: LKML
On Wed, 2016-08-10 at 14:30 +0200, Peter Zijlstra wrote:
> On Wed, Aug 10, 2016 at 01:14:29PM +0200, Mike Galbraith wrote:
> > --- a/kernel/sched/cputime.c
> > +++ b/kernel/sched/cputime.c
> > @@ -608,11 +608,13 @@ static void cputime_adjust(struct task_c
> >
> >  	if (utime == 0) {
> >  		stime = rtime;
> > +		utime = prev->utime;
> >  		goto update;
> >  	}
> >
> >  	if (stime == 0) {
> >  		utime = rtime;
> > +		stime = prev->stime;
> >  		goto update;
> >  	}
>
> This cannot be right; it violates that utime+stime==rtime. Let me try
> and figure out what actually happens.
How about this instead.
sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression
Roughly 10% of the time, ltp testcase getrusage04 fails:
getrusage04 0 TINFO : Expected timers granularity is 4000 us
getrusage04 0 TINFO : Using 1 as multiply factor for max [us]time increment (1000+4000us)!
getrusage04 0 TINFO : utime: 0us; stime: 179us
getrusage04 0 TINFO : utime: 3751us; stime: 0us
getrusage04 1 TFAIL : getrusage04.c:133: stime increased > 5000us:
If ->sum_exec_runtime has moved beyond the rtime of ->prev_cputime, but
no time has as yet been accounted to the task, bail.
Fixes: 9d7fb0427648 ("sched/cputime: Guarantee stime + utime == rtime")
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: stable@vger.kernel.org # 4.3+
---
kernel/sched/cputime.c | 7 +++++++
1 file changed, 7 insertions(+)
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -606,6 +606,13 @@ static void cputime_adjust(struct task_c
 	stime = curr->stime;
 	utime = curr->utime;

+	/*
+	 * sum_exec_runtime has moved, but nothing has yet been
+	 * accounted to the task, there's nothing to update.
+	 */
+	if (utime + stime == 0)
+		goto out;
+
 	if (utime == 0) {
 		stime = rtime;
 		goto update;
* Re: [patch] sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression
2016-08-10 18:57 ` Mike Galbraith
@ 2016-08-15 8:51 ` Peter Zijlstra
2016-08-15 12:29 ` Mike Galbraith
0 siblings, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2016-08-15 8:51 UTC (permalink / raw)
To: Mike Galbraith; +Cc: LKML
On Wed, Aug 10, 2016 at 08:57:28PM +0200, Mike Galbraith wrote:
> sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression
>
> Roughly 10% of the time, ltp testcase getrusage04 fails:
> getrusage04 0 TINFO : Expected timers granularity is 4000 us
> getrusage04 0 TINFO : Using 1 as multiply factor for max [us]time increment (1000+4000us)!
> getrusage04 0 TINFO : utime: 0us; stime: 179us
> getrusage04 0 TINFO : utime: 3751us; stime: 0us
> getrusage04 1 TFAIL : getrusage04.c:133: stime increased > 5000us:
>
> If ->sum_exec_runtime has moved beyond the rtime of ->prev_cputime, but
> no time has as yet been accounted to the task, bail.
>
> Fixes: 9d7fb0427648 ("sched/cputime: Guarantee stime + utime == rtime")
> Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
> Cc: stable@vger.kernel.org # 4.3+
> ---
> kernel/sched/cputime.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -606,6 +606,13 @@ static void cputime_adjust(struct task_c
>  	stime = curr->stime;
>  	utime = curr->utime;
>
> +	/*
> +	 * sum_exec_runtime has moved, but nothing has yet been
> +	 * accounted to the task, there's nothing to update.
> +	 */
> +	if (utime + stime == 0)
> +		goto out;
urgh...
Valid scenario.. not sure about the solution though. This would mean the
task has _no_ running time if it forever dodges the tick, which would be
bad.
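
A minimal userspace sketch of that concern, assuming only the early bail-out from the quoted hunk. `struct cputime`, `adjust_v2()` and `reported_total_after()` are hypothetical names, and the 1000us step per sample is an arbitrary illustrative value.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's cputime bookkeeping (assumption). */
struct cputime {
	uint64_t utime;
	uint64_t stime;
};

/* Model of the special cases with the early 'goto out' added. */
void adjust_v2(const struct cputime *curr, uint64_t rtime,
	       struct cputime *prev)
{
	uint64_t utime = curr->utime;
	uint64_t stime = curr->stime;

	assert(curr && prev);

	if (utime + stime == 0)
		return;		/* 'goto out': prev is reported unchanged */

	if (utime == 0)
		stime = rtime;
	else if (stime == 0)
		utime = rtime;

	prev->utime = utime;
	prev->stime = stime;
}

/*
 * A task that is never hit by a tick: curr stays at zero while the
 * precise runtime keeps growing. Returns the total time reported
 * after the given number of samples.
 */
uint64_t reported_total_after(unsigned int samples)
{
	struct cputime prev = { 0, 0 };
	struct cputime curr = { 0, 0 };
	uint64_t rtime = 0;
	unsigned int i;

	for (i = 0; i < samples; i++) {
		rtime += 1000;	/* the task really did run */
		adjust_v2(&curr, rtime, &prev);
	}
	return prev.utime + prev.stime;
}
```

However many samples are taken, the reported total stays at 0 in this model, even though rtime has advanced on every iteration.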
Does something like so cure things too?
---
kernel/sched/cputime.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 9858266fb0b3..2ee83b200504 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -614,19 +614,25 @@ static void cputime_adjust(struct task_cputime *curr,
 	stime = curr->stime;
 	utime = curr->utime;

-	if (utime == 0) {
-		stime = rtime;
+	/*
+	 * If either stime or both stime and utime are 0, assume all runtime is
+	 * userspace. Once a task gets some ticks, the monotonicity code at
+	 * 'update' will ensure things converge to the observed ratio.
+	 */
+	if (stime == 0) {
+		utime = rtime;
 		goto update;
 	}

-	if (stime == 0) {
-		utime = rtime;
+	if (utime == 0) {
+		stime = rtime;
 		goto update;
 	}

 	stime = scale_stime((__force u64)stime, (__force u64)rtime,
 			    (__force u64)(stime + utime));

+update:
 	/*
 	 * Make sure stime doesn't go backwards; this preserves monotonicity
 	 * for utime because rtime is monotonic.
@@ -649,7 +655,6 @@ static void cputime_adjust(struct task_cputime *curr,
 		stime = rtime - utime;
 	}

-update:
 	prev->stime = stime;
 	prev->utime = utime;
 out:
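
A minimal userspace sketch of the patched flow, under several assumptions: `struct cputime`, `adjust_fixed()` and `replay_fixed()` are hypothetical names, a plain proportional split stands in for `scale_stime()`, the clamping at `update` is reconstructed from the comments in the hunk, and the early return for a non-advancing rtime is assumed from code outside the hunks shown. The sample values other than 179 and 3751 are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's cputime bookkeeping (assumption). */
struct cputime {
	uint64_t utime;
	uint64_t stime;
};

void adjust_fixed(const struct cputime *curr, uint64_t rtime,
		  struct cputime *prev)
{
	uint64_t utime = curr->utime;
	uint64_t stime = curr->stime;

	assert(curr && prev);

	/* Assumed filter: nothing to do unless rtime moved past prev. */
	if (prev->stime + prev->utime >= rtime)
		return;

	if (stime == 0) {
		utime = rtime;	/* also covers utime == stime == 0 */
	} else if (utime == 0) {
		stime = rtime;
	} else {
		/* Proportional split; may overflow for very large inputs. */
		stime = stime * rtime / (stime + utime);
	}

	/* 'update:' keep stime from going backwards, then rederive utime */
	if (stime < prev->stime)
		stime = prev->stime;
	utime = rtime - stime;

	/* Keep utime from going backwards, then rederive stime */
	if (utime < prev->utime) {
		utime = prev->utime;
		stime = rtime - utime;
	}

	prev->stime = stime;
	prev->utime = utime;
}

/* Returns 1 if every step is monotonic and utime + stime == rtime. */
int replay_fixed(void)
{
	const struct cputime samples[] = { { 0, 0 }, { 4000, 0 }, { 4000, 4000 } };
	const uint64_t rtimes[] = { 179, 3751, 8000 };
	struct cputime prev = { 0, 0 };
	int ok = 1;
	size_t i;

	for (i = 0; i < 3; i++) {
		struct cputime last = prev;

		adjust_fixed(&samples[i], rtimes[i], &prev);
		ok &= prev.utime >= last.utime && prev.stime >= last.stime;
		ok &= prev.utime + prev.stime == rtimes[i];
	}
	return ok;
}
```

Because both special cases now fall through the clamping at `update`, a value reported earlier cannot be overwritten by a smaller one, and the sum still equals rtime.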
* Re: [patch] sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression
2016-08-15 8:51 ` Peter Zijlstra
@ 2016-08-15 12:29 ` Mike Galbraith
0 siblings, 0 replies; 7+ messages in thread
From: Mike Galbraith @ 2016-08-15 12:29 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: LKML
On Mon, 2016-08-15 at 10:51 +0200, Peter Zijlstra wrote:
> On Wed, Aug 10, 2016 at 08:57:28PM +0200, Mike Galbraith wrote:
> >
> > +	/*
> > +	 * sum_exec_runtime has moved, but nothing has yet been
> > +	 * accounted to the task, there's nothing to update.
> > +	 */
> > +	if (utime + stime == 0)
> > +		goto out;
>
> urgh...
>
> Valid scenario.. not sure about the solution though. This would mean the
> task has _no_ running time if it forever dodges the tick, which would be
> bad.
>
> Does something like so cure things too?
Yeah, it's a happy camper.
> ---
> kernel/sched/cputime.c | 15 ++++++++++-----
> 1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index 9858266fb0b3..2ee83b200504 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -614,19 +614,25 @@ static void cputime_adjust(struct task_cputime *curr,
>  	stime = curr->stime;
>  	utime = curr->utime;
>
> -	if (utime == 0) {
> -		stime = rtime;
> +	/*
> +	 * If either stime or both stime and utime are 0, assume all runtime is
> +	 * userspace. Once a task gets some ticks, the monotonicity code at
> +	 * 'update' will ensure things converge to the observed ratio.
> +	 */
> +	if (stime == 0) {
> +		utime = rtime;
>  		goto update;
>  	}
>
> -	if (stime == 0) {
> -		utime = rtime;
> +	if (utime == 0) {
> +		stime = rtime;
>  		goto update;
>  	}
>
>  	stime = scale_stime((__force u64)stime, (__force u64)rtime,
>  			    (__force u64)(stime + utime));
>
> +update:
>  	/*
>  	 * Make sure stime doesn't go backwards; this preserves monotonicity
>  	 * for utime because rtime is monotonic.
> @@ -649,7 +655,6 @@ static void cputime_adjust(struct task_cputime *curr,
>  		stime = rtime - utime;
>  	}
>
> -update:
>  	prev->stime = stime;
>  	prev->utime = utime;
> out: