public inbox for linux-kernel@vger.kernel.org
From: Chen Yu <yu.c.chen@intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Abel Wu <wuyun.abel@bytedance.com>,
	Ingo Molnar <mingo@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Tim Chen <tim.c.chen@intel.com>,
	Tiwei Bie <tiwei.btw@antgroup.com>,
	Honglei Wang <wanghonglei@didichuxing.com>,
	"Aaron Lu" <aaron.lu@intel.com>, Chen Yu <yu.chen.surf@gmail.com>,
	<linux-kernel@vger.kernel.org>,
	kernel test robot <oliver.sang@intel.com>
Subject: Re: [RFC PATCH] sched/eevdf: Return leftmost entity in pick_eevdf() if no eligible entity is found
Date: Mon, 8 Apr 2024 21:11:39 +0800
Message-ID: <ZhPtCyRmPxa0DpMe@chenyu5-mobl2>
In-Reply-To: <20240408115833.GF21904@noisy.programming.kicks-ass.net>

On 2024-04-08 at 13:58:33 +0200, Peter Zijlstra wrote:
> On Thu, Feb 29, 2024 at 05:00:18PM +0800, Abel Wu wrote:
> 
> > > According to the log, vruntime is 18435852013561943404, the
> > > cfs_rq->min_vruntime is 763383370431, the load is 629 + 2048 = 2677,
> > > thus:
> > > s64 delta = (s64)(18435852013561943404 - 763383370431) = -10892823530978643
> > >      delta * 2677 = 7733399554989275921
> > > that is to say, the multiply result overflow the s64, which turns the
> > > negative value into a positive value, thus eligible check fails.
> > 
> > Indeed.
> 
> From the data presented it looks like min_vruntime is wrong and needs
> update. If you can readily reproduce this, dump the vruntime of all
> tasks on the runqueue and see if min_vruntime is indeed correct.
>

This is the dump of all the entities on the tree when the issue happens,
walked first from left to right and then from the root down (in-order traversal):

[  514.461242][ T8390] cfs_rq avg_vruntime:386638640128 avg_load:2048 cfs_rq->min_vruntime:763383370431
[  514.535935][ T8390] current on_rq se 0xc5851400, deadline:18435852013562231446
			min_vruntime:18437121115753667698 vruntime:18435852013561943404, load:629


[  514.536772][ T8390] Traverse rb-tree from left to right
[  514.537138][ T8390]  se 0xec1234e0 deadline:763384870431 min_vruntime:763383370431 vruntime:763383370431 non-eligible  <-- leftmost se
[  514.537835][ T8390]  se 0xec4fcf20 deadline:763762447228 min_vruntime:763760947228 vruntime:763760947228 non-eligible

[  514.538539][ T8390] Traverse rb-tree from topdown
[  514.538877][ T8390]  middle se 0xec1234e0 deadline:763384870431 min_vruntime:763383370431 vruntime:763383370431 non-eligible   <-- root se
[  514.539605][ T8390]  middle se 0xec4fcf20 deadline:763762447228 min_vruntime:763760947228 vruntime:763760947228 non-eligible

The tree looks like:

          se (0xec1234e0)
                  |
                  |
                  ----> se (0xec4fcf20)


The root se 0xec1234e0 is also the leftmost se. Its min_vruntime and vruntime are both 763383370431,
which matches cfs_rq->min_vruntime. So it seems that cfs_rq->min_vruntime is being updated correctly,
since it is monotonically increasing.

My guess is that, for some reason, a newly forked se in a newly created task group stayed in the rb-tree
without being picked for a long time (maybe it was not eligible). Its vruntime stayed at a negative value
(near (unsigned long)(-(1LL << 20))) for a long time, so it fell far behind cfs_rq->min_vruntime, and thus
the overflow happens.


thanks,
Chenyu

> > > So where is this insane huge vruntime 18435852013561943404 coming from?
> > > My guess is that, it is because the initial value of cfs_rq->min_vruntime
> > > is set to (unsigned long)(-(1LL << 20)). If the task(watchdog in this case)
> > > seldom scheduled in, its vruntime might not move forward too much and
> > > remain its original value by previous place_entity().
> > 
> > So why not just initialize to 0? The (unsigned long)(-(1LL << 20))
> > thing is dangerous as it can easily blow up lots of calculations in
> > lag, key, avg_vruntime and so on.
> 
> The reason is to ensure the wrap-around logic works -- which it must,
> because with the weighting thing, the vruntime can wrap quite quickly,
> something like one day IIRC (20 bit for precision etc.)
> 
> Better to have the wrap around happen quickly after boot and have
> everybody suffer, rather than have it be special and hard to reproduce.

