From: Chen Yu <yu.c.chen@intel.com>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Roman Kagan <rkagan@amazon.de>,
Peter Zijlstra <peterz@infradead.org>,
Zhang Qiao <zhangqiao22@huawei.com>,
Waiman Long <longman@redhat.com>,
"Ingo Molnar" <mingo@redhat.com>,
Juri Lelli <juri.lelli@redhat.com>,
"Dietmar Eggemann" <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
"Daniel Bristot de Oliveira" <bristot@redhat.com>,
lkml <linux-kernel@vger.kernel.org>
Subject: Re: [bug-report] possible s64 overflow in max_vruntime()
Date: Wed, 1 Feb 2023 20:52:26 +0800
Message-ID: <Y9pgitjZHTkbssxV@chenyu5-mobl1>
In-Reply-To: <CAKfTPtDUMph262w5OSiSQi-BVcNRf2gN=PdmxYCKEuk-8aYhgA@mail.gmail.com>
On 2023-01-31 at 12:10:29 +0100, Vincent Guittot wrote:
> On Tue, 31 Jan 2023 at 11:00, Roman Kagan <rkagan@amazon.de> wrote:
> >
> > On Tue, Jan 31, 2023 at 11:21:17AM +0800, Chen Yu wrote:
> > > On 2023-01-27 at 17:18:56 +0100, Vincent Guittot wrote:
> > > > On Fri, 27 Jan 2023 at 12:44, Peter Zijlstra <peterz@infradead.org> wrote:
> > > > >
> > > > > On Thu, Jan 26, 2023 at 07:31:02PM +0100, Roman Kagan wrote:
> > > > >
> > > > > > > All that only matters for small sleeps anyway.
> > > > > > >
> > > > > > > Something like:
> > > > > > >
> > > > > > > sleep_time = U64_MAX;
> > > > > > > if (se->avg.last_update_time)
> > > > > > > > 	sleep_time = cfs_rq_clock_pelt(cfs_rq) - se->avg.last_update_time;
> > > > > >
> > > > > > Interesting, why not rq_clock_task(rq_of(cfs_rq)) - se->exec_start, as
> > > > > > others were suggesting? It appears to better match the notion of sleep
> > > > > > wall-time, no?
> > > > >
> > > > > Should also work I suppose. cfs_rq_clock takes throttling into account,
> > > > > but that should hopefully also not be *that* long, so either should
> > > > > work.
> > > >
> > > > yes rq_clock_task(rq_of(cfs_rq)) should be fine too
> > > >
> > > > Another thing to take into account is the sleeper credit that the
> > > > waking task deserves so the detection should be done once it has been
> > > > subtracted from vruntime.
> > > >
> > > > Last point, when a nice -20 task runs on a rq, it will take a bit more
> > > > than 2 seconds for the vruntime to be increased by more than 24ms (the
> > > > maximum credit that a waking task can get) so threshold must be
> > > > significantly higher than 2 sec. On the opposite side, the lowest
> > > > possible weight of a cfs rq is 2 which means that the problem appears
> > > > for a sleep longer or equal to 2^54 = 2^63*2/1024. We should use this
> > > > value instead of an arbitrary 200 days
> > > Does it mean any threshold between 2 sec and 2^54 nsec should be fine? Because
> > > 1. Any task sleeps longer than 2 sec will get at most 24 ms(sysctl_sched_latency)
> > > 'vruntime bonus' when enqueued.
>
> This means that if a task nice -20 runs on cfs rq while your task is
> sleeping 2 seconds, the min vruntime of the cfs rq will increase by
> 24ms. If there are 2 nice -20 tasks then the min vruntime will
> increase by 24ms after 4 seconds and so on ...
>
Got it, thanks for this example.
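To double-check the numbers, here is a rough sketch of the arithmetic. It
assumes the simplified relation vruntime_delta = delta_exec * 1024 / weight
(the kernel uses a fixed-point inverse-weight multiply instead) and the
nice -20 weight of 88761 from sched_prio_to_weight[]. The helper names are
made up for illustration and are not kernel functions.

```c
#include <stdint.h>

#define NICE_0_WEIGHT    1024ULL   /* weight of a nice 0 task */
#define NICE_M20_WEIGHT  88761ULL  /* weight of a nice -20 task */

/*
 * Hypothetical helper: vruntime advance (ns) of a task of the given
 * weight after running for delta_exec_ns. Simplified model, plain
 * integer division instead of the kernel's inverse-weight multiply.
 */
static uint64_t vruntime_advance_ns(uint64_t delta_exec_ns, uint64_t weight)
{
	return delta_exec_ns * NICE_0_WEIGHT / weight;
}

/*
 * Hypothetical helper: wall time (ns) needed for nr_tasks equal-weight
 * tasks sharing one CPU to each advance their vruntime by target_ns.
 * Each task only gets 1/nr_tasks of the CPU, hence the final multiply.
 */
static uint64_t wall_time_for_advance_ns(uint64_t target_ns, uint64_t weight,
					 uint64_t nr_tasks)
{
	return target_ns * weight / NICE_0_WEIGHT * nr_tasks;
}
```

Under this model, vruntime_advance_ns(2000000000, NICE_M20_WEIGHT) is about
23 ms, and wall_time_for_advance_ns(24000000, NICE_M20_WEIGHT, 1) is about
2.08 seconds, which matches "a bit more than 2 seconds" for 24ms. With 2
tasks the wall time doubles.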
> On the other side, a task nice 19 that runs 1ms will increase its
> vruntime by around 68ms.
>
> So if there is 1 task nice 19 with 11 tasks nice -20 on the same cfs
> rq, the nice -19 one should run 1ms every 65 seconds and this also
I assume you were referring to the nice 19 task here, and likewise for the
following '-19'.
> means that the vruntime of task nice -19 should still be above
> min_vruntime after sleeping 60 seconds.
So even if the nice 19 task sleeps for a long time, cfs_rq->min_vruntime cannot
take the lead, and the overflow of s64(min_vruntime - se->vruntime) will not happen.
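The 68ms and 65 seconds figures can be reproduced with the sketch below. It
makes the same simplifying assumption as the scheduler's basic relation,
vruntime_delta = delta_exec * 1024 / weight, uses the weights 15 (nice 19)
and 88761 (nice -20) from sched_prio_to_weight[], and ignores the CPU share
of the nice 19 task itself. The function names are only illustrative.

```c
#include <stdint.h>

#define NICE_0_WEIGHT    1024ULL
#define NICE_M20_WEIGHT  88761ULL  /* nice -20 */
#define NICE_19_WEIGHT   15ULL     /* nice 19 */

/* Hypothetical helper: vruntime gained (ns) by running run_ns at weight. */
static uint64_t vruntime_gain_ns(uint64_t run_ns, uint64_t weight)
{
	return run_ns * NICE_0_WEIGHT / weight;
}

/*
 * Hypothetical helper: wall time (ns) until nr_runners tasks of
 * runner_weight, sharing one CPU, have all advanced their vruntime
 * by lead_ns, i.e. until min_vruntime has caught up with a task
 * that was lead_ns ahead.
 */
static uint64_t catch_up_wall_ns(uint64_t lead_ns, uint64_t runner_weight,
				 uint64_t nr_runners)
{
	/* runtime each runner needs, then scaled by the number of runners */
	return lead_ns * runner_weight / NICE_0_WEIGHT * nr_runners;
}
```

With these assumptions, vruntime_gain_ns(1000000, NICE_19_WEIGHT) is roughly
68.3 ms, and catch_up_wall_ns() of that lead with 11 nice -20 runners comes
out at roughly 65.1 seconds.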
> Of course this is even worse
> with a child cgroup with the lowest weight (weight of 2 instead of 15)
>
> Just to say that 60 seconds is not so far away and 2^54 should be better IMHO
>
2^54 could be the "earliest" interval that could trigger the s64 overflow (because any
weight > 2 will not trigger the overflow when sleeping for 2^54).
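A small sketch of why 2^54 is the first interval that matters, assuming again
vruntime_delta = delta_exec * 1024 / weight and a minimum cfs_rq weight of 2
as stated above. max_vruntime_sketch() mirrors the signed-difference
comparison done by max_vruntime(); overflow_sleep_ns() is a made-up helper.

```c
#include <stdint.h>

#define NICE_0_WEIGHT   1024ULL
#define MIN_CFS_WEIGHT  2ULL  /* lowest cfs_rq weight, per this thread */

/*
 * Hypothetical helper: runtime (ns) after which an entity of the given
 * weight has advanced its vruntime by 2^63, the point where the s64
 * difference can no longer represent the gap.
 */
static uint64_t overflow_sleep_ns(uint64_t weight)
{
	return (1ULL << 63) / NICE_0_WEIGHT * weight;
}

/*
 * Same comparison shape as max_vruntime(): the u64 difference is read
 * as a signed 64-bit value. The unsigned-to-signed conversion wraps on
 * the compilers the kernel supports (two's complement assumed here).
 */
static uint64_t max_vruntime_sketch(uint64_t max_vruntime, uint64_t vruntime)
{
	int64_t delta = (int64_t)(vruntime - max_vruntime);

	if (delta > 0)
		max_vruntime = vruntime;
	return max_vruntime;
}
```

Here overflow_sleep_ns(MIN_CFS_WEIGHT) is exactly 2^54 ns, about 208 days,
and any larger weight gives a longer interval. Once the gap reaches 2^63,
max_vruntime_sketch() sees a negative delta and keeps the stale, smaller
value instead of the newer one.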
thanks,
Chenyu