From: Chen Yu <yu.c.chen@intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Abel Wu <wuyun.abel@bytedance.com>,
	Ingo Molnar <mingo@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Tim Chen <tim.c.chen@intel.com>,
	Tiwei Bie <tiwei.btw@antgroup.com>,
	Honglei Wang <wanghonglei@didichuxing.com>,
	"Aaron Lu" <aaron.lu@intel.com>, Chen Yu <yu.chen.surf@gmail.com>,
	<linux-kernel@vger.kernel.org>,
	kernel test robot <oliver.sang@intel.com>
Subject: Re: [RFC PATCH] sched/eevdf: Return leftmost entity in pick_eevdf() if no eligible entity is found
Date: Mon, 8 Apr 2024 21:11:39 +0800
Message-ID: <ZhPtCyRmPxa0DpMe@chenyu5-mobl2>
In-Reply-To: <20240408115833.GF21904@noisy.programming.kicks-ass.net>

On 2024-04-08 at 13:58:33 +0200, Peter Zijlstra wrote:
> On Thu, Feb 29, 2024 at 05:00:18PM +0800, Abel Wu wrote:
> 
> > > According to the log, vruntime is 18435852013561943404, the
> > > cfs_rq->min_vruntime is 763383370431, the load is 629 + 2048 = 2677,
> > > thus:
> > > s64 delta = (s64)(18435852013561943404 - 763383370431) = -10892823530978643
> > >      delta * 2677 = 7733399554989275921
> > > that is to say, the multiply result overflow the s64, which turns the
> > > negative value into a positive value, thus eligible check fails.
> > 
> > Indeed.
> 
> From the data presented it looks like min_vruntime is wrong and needs
> update. If you can readily reproduce this, dump the vruntime of all
> tasks on the runqueue and see if min_vruntime is indeed correct.
>

This is the dump of all the entities on the tree when the issue happens,
first from left to right, then top-down from the root (in-order traversal):

[  514.461242][ T8390] cfs_rq avg_vruntime:386638640128 avg_load:2048 cfs_rq->min_vruntime:763383370431
[  514.535935][ T8390] current on_rq se 0xc5851400, deadline:18435852013562231446
			min_vruntime:18437121115753667698 vruntime:18435852013561943404, load:629


[  514.536772][ T8390] Traverse rb-tree from left to right
[  514.537138][ T8390]  se 0xec1234e0 deadline:763384870431 min_vruntime:763383370431 vruntime:763383370431 non-eligible  <-- leftmost se
[  514.537835][ T8390]  se 0xec4fcf20 deadline:763762447228 min_vruntime:763760947228 vruntime:763760947228 non-eligible

[  514.538539][ T8390] Traverse rb-tree from topdown
[  514.538877][ T8390]  middle se 0xec1234e0 deadline:763384870431 min_vruntime:763383370431 vruntime:763383370431 non-eligible   <-- root se
[  514.539605][ T8390]  middle se 0xec4fcf20 deadline:763762447228 min_vruntime:763760947228 vruntime:763760947228 non-eligible

The tree looks like:

          se (0xec1234e0)
                  |
                  |
                  ----> se (0xec4fcf20)


The root se 0xec1234e0 is also the leftmost se; its min_vruntime and vruntime are both 763383370431,
which matches cfs_rq->min_vruntime. It seems that cfs_rq->min_vruntime is updated correctly,
because it increases monotonically.
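
As a quick sanity check of the dump, the cfs_rq avg_vruntime value can be recomputed
from the two tree entities as the sum of (vruntime - min_vruntime) * weight. This is
only a userspace sketch, not kernel code: struct se_sample, sum_weighted_keys() and
check_dump() are hypothetical names, and the per-entity weight of 1024 is an assumption
(the dump only prints avg_load:2048 for the two entities).

```c
#include <assert.h>
#include <stdint.h>

/* One queued entity as printed in the dump. */
struct se_sample {
	uint64_t vruntime;
	uint64_t weight;	/* ASSUMPTION: 1024 each, not printed in the dump */
};

/* Sum of (vruntime - min_vruntime) * weight over the queued entities. */
static int64_t sum_weighted_keys(const struct se_sample *se, int n,
				 uint64_t min_vruntime)
{
	uint64_t sum = 0;	/* accumulate unsigned, so any wrap is well defined */

	for (int i = 0; i < n; i++)
		sum += (se[i].vruntime - min_vruntime) * se[i].weight;
	return (int64_t)sum;
}

/* Replays the two tree entities from the dump. */
static void check_dump(void)
{
	const struct se_sample tree[] = {
		{ 763383370431ULL, 1024 },	/* se 0xec1234e0, leftmost */
		{ 763760947228ULL, 1024 },	/* se 0xec4fcf20 */
	};

	/* Matches "cfs_rq avg_vruntime:386638640128" in the log. */
	assert(sum_weighted_keys(tree, 2, 763383370431ULL) == 386638640128LL);
}
```

With the assumed weights, the result agrees with the logged avg_vruntime, which
is consistent with min_vruntime being in sync with the leftmost entity.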

My guess is that, for some reason, a newly forked se in a newly created task group has been sitting in the rb-tree
without being picked for a long time (maybe it was not eligible). Its vruntime stayed at the negative value (near
(unsigned long)(-(1LL << 20))) all that time, so it fell far behind cfs_rq->min_vruntime, and thus the overflow happens.
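
To make the overflow concrete, here is a small userspace replay of the numbers from the
log. It is a sketch under assumptions, not the kernel code: vruntime_key(), mul_wrap(),
eligible_wrapped() and replay_log_values() are hypothetical helpers, the comparison
avg >= key * load is modeled on my reading of the eligibility check, and the multiply is
done in u64 so that the wrap is well defined in userspace C. __builtin_mul_overflow() is
a GCC/Clang builtin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Signed distance from min_vruntime: (s64)(vruntime - min_vruntime). */
static int64_t vruntime_key(uint64_t vruntime, uint64_t min_vruntime)
{
	return (int64_t)(vruntime - min_vruntime);
}

/* key * load with two's-complement wrap (u64 multiply avoids signed UB). */
static int64_t mul_wrap(int64_t key, int64_t load)
{
	return (int64_t)((uint64_t)key * (uint64_t)load);
}

/* Illustrative eligibility test using the wrapped product. */
static bool eligible_wrapped(int64_t avg, int64_t key, int64_t load)
{
	return avg >= mul_wrap(key, load);
}

/* Replays the values printed in the log for the current se. */
static void replay_log_values(void)
{
	const uint64_t min_vruntime = 763383370431ULL;
	const int64_t key = vruntime_key(18435852013561943404ULL, min_vruntime);
	const int64_t load = 2048 + 629;	/* avg_load + weight of curr */
	/* avg_vruntime plus curr's contribution; this product still fits in s64. */
	const int64_t avg = 386638640128LL + key * 629;
	int64_t unused;

	assert(key == -10892823530978643LL);
	/* key * 2677 does not fit in s64 ... */
	assert(__builtin_mul_overflow(key, load, &unused));
	/* ... and wraps to the positive value seen in the analysis. */
	assert(mul_wrap(key, load) == 7733399554989275921LL);
	/* So the current se is reported as non-eligible. */
	assert(!eligible_wrapped(avg, key, load));
	/* The leftmost se has key 0 and avg is negative: also non-eligible. */
	assert(!eligible_wrapped(avg, 0, load));
}
```

Under these assumptions, every entity comes out as non-eligible, which matches
the dump above.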


thanks,
Chenyu

> > > So where is this insane huge vruntime 18435852013561943404 coming from?
> > > My guess is that, it is because the initial value of cfs_rq->min_vruntime
> > > is set to (unsigned long)(-(1LL << 20)). If the task(watchdog in this case)
> > > seldom scheduled in, its vruntime might not move forward too much and
> > > remain its original value by previous place_entity().
> > 
> > So why not just initialize to 0? The (unsigned long)(-(1LL << 20))
> > thing is dangerous as it can easily blow up lots of calculations in
> > lag, key, avg_vruntime and so on.
> 
> The reason is to ensure the wrap-around logic works -- which it must,
> because with the weighting thing, the vruntime can wrap quite quickly,
> something like one day IIRC (20 bit for precision etc.)
> 
> Better to have the wrap around happen quickly after boot and have
> everybody suffer, rather than have it be special and hard to reproduce.
