From: Frederic Weisbecker <fweisbec@gmail.com>
To: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Borislav Petkov <bp@alien8.de>,
linux-kernel@vger.kernel.org
Subject: Re: [sched next] overflowed cpu time for kernel threads in /proc/PID/stat
Date: Mon, 2 Sep 2013 15:07:45 +0200 [thread overview]
Message-ID: <20130902130744.GB2378@somewhere> (raw)
In-Reply-To: <20130902122845.GA2457@swordfish.minsk.epam.com>
On Mon, Sep 02, 2013 at 03:28:45PM +0300, Sergey Senozhatsky wrote:
> On (08/31/13 01:04), Frederic Weisbecker wrote:
> > > in cputime_adjust() `stime' is greater than `rtime', so `utime = rtime - stime'
> > > sets wrong value.
> >
> > But stime should always be below rtime due to the calculation done by scale_stime()
> > which roughly sums up to:
> >
> > stime = (stime / stime + utime) * rtime
> >
> > So this shouldn't happen.
> >
> > I couldn't manage to reproduce it though. Can you still trigger it with latest -next?
> >
>
> Hope this may help.
> I've added a silly check to make sure that `stime < rtime'
>
> @@ -579,6 +582,10 @@ static void cputime_adjust(struct task_cputime *curr,
> if (total) {
> stime = scale_stime((__force u64)stime,
> (__force u64)rtime, (__force u64)total);
> + if (stime > rtime) {
> + printk(KERN_ERR "Ooops: stime:%llu rtime:%llu\n", stime, rtime);
> + WARN_ON(1);
> + }
> utime = rtime - stime;
> } else {
> stime = rtime;
>
>
> and got the following results:
>
>
> [ 1295.311056] Ooops: stime:4622750929 rtime:4622750928
> [ 1295.311063] ------------[ cut here ]------------
> [ 1295.311072] WARNING: CPU: 1 PID: 2167 at kernel/sched/cputime.c:584 cputime_adjust+0x132/0x140()
> [ 1295.311123] CPU: 1 PID: 2167 Comm: top Tainted: G C 3.11.0-rc7-next-20130902-dbg-dirty #2
> [ 1295.311126] Hardware name: Acer Aspire 5741G /Aspire 5741G , BIOS V1.20 02/08/2011
> [ 1295.311128] 0000000000000009 ffff880099937c50 ffffffff8162fc85 0000000000000000
> [ 1295.311134] ffff880099937c88 ffffffff8104ee2d ffff8801530bb158 ffff880099937df0
> [ 1295.311139] 00000001138994d0 00000001138994d1 ffff880153118ff0 ffff880099937c98
> [ 1295.311145] Call Trace:
> [ 1295.311155] [<ffffffff8162fc85>] dump_stack+0x4e/0x82
> [ 1295.311162] [<ffffffff8104ee2d>] warn_slowpath_common+0x7d/0xa0
> [ 1295.311166] [<ffffffff8104ef0a>] warn_slowpath_null+0x1a/0x20
> [ 1295.311170] [<ffffffff8108db72>] cputime_adjust+0x132/0x140
> [ 1295.311175] [<ffffffff8108e941>] thread_group_cputime_adjusted+0x41/0x50
> [ 1295.311182] [<ffffffff811f3ac4>] do_task_stat+0x9c4/0xb50
> [ 1295.311188] [<ffffffff811f4904>] proc_tgid_stat+0x14/0x20
> [ 1295.311192] [<ffffffff811edac4>] proc_single_show+0x54/0xa0
> [ 1295.311198] [<ffffffff811a6d24>] seq_read+0x164/0x3d0
> [ 1295.311204] [<ffffffff811805e1>] vfs_read+0xa1/0x180
> [ 1295.311208] [<ffffffff8118119c>] SyS_read+0x4c/0xa0
> [ 1295.311213] [<ffffffff8163ad06>] tracesys+0xd4/0xd9
> [ 1295.311216] ---[ end trace bab1e899ff45eea2 ]---
>
> [ 1301.384396] Ooops: stime:4654072951 rtime:4654072950
> [ 1301.384401] ------------[ cut here ]------------
> [ 1301.384409] WARNING: CPU: 0 PID: 2167 at kernel/sched/cputime.c:584 cputime_adjust+0x132/0x140()
> [ 1301.384450] CPU: 0 PID: 2167 Comm: top Tainted: G WC 3.11.0-rc7-next-20130902-dbg-dirty #2
> [ 1301.384452] Hardware name: Acer Aspire 5741G /Aspire 5741G , BIOS V1.20 02/08/2011
> [ 1301.384454] 0000000000000009 ffff880099937c50 ffffffff8162fc85 0000000000000000
> [ 1301.384458] ffff880099937c88 ffffffff8104ee2d ffff8801530bb158 ffff880099937df0
> [ 1301.384462] 0000000115678476 0000000115678477 ffff880153118ff0 ffff880099937c98
> [ 1301.384466] Call Trace:
> [ 1301.384475] [<ffffffff8162fc85>] dump_stack+0x4e/0x82
> [ 1301.384481] [<ffffffff8104ee2d>] warn_slowpath_common+0x7d/0xa0
> [ 1301.384484] [<ffffffff8104ef0a>] warn_slowpath_null+0x1a/0x20
> [ 1301.384487] [<ffffffff8108db72>] cputime_adjust+0x132/0x140
> [ 1301.384491] [<ffffffff8108e941>] thread_group_cputime_adjusted+0x41/0x50
> [ 1301.384497] [<ffffffff811f3ac4>] do_task_stat+0x9c4/0xb50
> [ 1301.384501] [<ffffffff811f4904>] proc_tgid_stat+0x14/0x20
> [ 1301.384504] [<ffffffff811edac4>] proc_single_show+0x54/0xa0
> [ 1301.384509] [<ffffffff811a6d24>] seq_read+0x164/0x3d0
> [ 1301.384514] [<ffffffff811805e1>] vfs_read+0xa1/0x180
> [ 1301.384517] [<ffffffff8118119c>] SyS_read+0x4c/0xa0
> [ 1301.384521] [<ffffffff8163ad06>] tracesys+0xd4/0xd9
> [ 1301.384523] ---[ end trace bab1e899ff45eea3 ]---
>
> and so on.
>
>
> $ dmesg | grep Ooops
> [ 1295.311056] Ooops: stime:4622750929 rtime:4622750928
> [ 1301.384396] Ooops: stime:4654072951 rtime:4654072950
> [ 1307.438935] Ooops: stime:4687858107 rtime:4687858106
> [ 1313.493462] Ooops: stime:4724166945 rtime:4724166944
> [ 1316.521740] Ooops: stime:4741142233 rtime:4741142232
> [ 1325.605625] Ooops: stime:4793906690 rtime:4793906688
> [ 1331.648611] Ooops: stime:4308413629 rtime:4308413628
> [ 1337.708468] Ooops: stime:4510156747 rtime:4510156746
> [ 1340.744586] Ooops: stime:4880584390 rtime:4880584389
> [ 1343.773111] Ooops: stime:4442005879 rtime:4442005878
> [ 1343.773362] Ooops: stime:4558310943 rtime:4558310942
> [ 1349.815797] Ooops: stime:4927379646 rtime:4927379645
> [ 1352.836936] Ooops: stime:4942782836 rtime:4942782834
> [ 1358.885232] Ooops: stime:4979031940 rtime:4979031938
> [ 1364.977902] Ooops: stime:5019633914 rtime:5019633912
> [ 1364.978283] Ooops: stime:4601969101 rtime:4601969100
> [ 1364.978623] Ooops: stime:4826393815 rtime:4826393814
> [ 1371.034265] Ooops: stime:4894272175 rtime:4894272174
> [ 1374.077150] Ooops: stime:4688831829 rtime:4688831828
> [ 1374.077519] Ooops: stime:4916319315 rtime:4916319314
> [ 1377.110416] Ooops: stime:5084174048 rtime:5084174047
> [ 1377.111043] Ooops: stime:4949250198 rtime:4949250196
> [ 1380.137382] Ooops: stime:5096963048 rtime:5096963046
> [ 1380.137739] Ooops: stime:4975014880 rtime:4975014879
> [ 1383.161441] Ooops: stime:5002466396 rtime:5002466394
> [ 1389.212166] Ooops: stime:5140400038 rtime:5140400036
> [ 1392.239303] Ooops: stime:5082051028 rtime:5082051027
> [ 1395.270391] Ooops: stime:5170118636 rtime:5170118634
> [ 1398.303593] Ooops: stime:5184355318 rtime:5184355316
Thanks a lot Sergey for testing this further!
Interesting results, so rtime is always one or two units off stime after scaling.
Stanislaw made the scaling code with Linus and he has a better idea on the math guts
here.
Stanislaw, any idea?
>
>
> -ss
next prev parent reply other threads:[~2013-09-02 13:07 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-08-20 11:14 [sched next] overflowed cpu time for kernel threads in /proc/PID/stat Sergey Senozhatsky
2013-08-20 15:15 ` Frederic Weisbecker
2013-08-20 15:35 ` Sergey Senozhatsky
2013-08-20 15:42 ` Frederic Weisbecker
2013-08-20 15:53 ` Sergey Senozhatsky
2013-08-20 18:48 ` Sergey Senozhatsky
2013-08-21 15:39 ` Sergey Senozhatsky
2013-08-30 23:04 ` Frederic Weisbecker
2013-08-31 19:01 ` Sergey Senozhatsky
2013-09-08 10:56 ` Frederic Weisbecker
2013-09-08 10:59 ` Frederic Weisbecker
2013-09-02 12:28 ` Sergey Senozhatsky
2013-09-02 13:07 ` Frederic Weisbecker [this message]
2013-09-02 13:39 ` Sergey Senozhatsky
2013-09-02 13:50 ` Stanislaw Gruszka
2013-09-02 14:00 ` Sergey Senozhatsky
2013-09-03 8:43 ` Stanislaw Gruszka
2013-09-03 9:33 ` Sergey Senozhatsky
2013-09-03 13:15 ` Sergey Senozhatsky
2013-09-03 18:09 ` Sergey Senozhatsky
2013-09-03 18:32 ` Ingo Molnar
2013-09-04 12:08 ` [PATCH -tip] sched/cputime: do not scale when utime == 0 Stanislaw Gruszka
2013-09-04 12:33 ` Ingo Molnar
2013-09-04 13:16 ` [PATCH -tip v2] " Stanislaw Gruszka
2013-09-04 14:31 ` Ingo Molnar
2013-09-04 14:37 ` [tip:timers/urgent] sched/cputime: Do " tip-bot for Stanislaw Gruszka
2013-09-04 15:35 ` Frederic Weisbecker
2013-09-02 14:13 ` [sched next] overflowed cpu time for kernel threads in /proc/PID/stat Frederic Weisbecker
2013-09-02 14:24 ` Sergey Senozhatsky
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130902130744.GB2378@somewhere \
--to=fweisbec@gmail.com \
--cc=bp@alien8.de \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=paulmck@linux.vnet.ibm.com \
--cc=peterz@infradead.org \
--cc=sergey.senozhatsky@gmail.com \
--cc=sgruszka@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox