From: "Doug Smythies" <dsmythies@telus.net>
To: "'Peter Zijlstra'" <peterz@infradead.org>,
	"'K Prateek Nayak'" <kprateek.nayak@amd.com>
Cc: <mingo@kernel.org>, <juri.lelli@redhat.com>,
	<vincent.guittot@linaro.org>, <dietmar.eggemann@arm.com>,
	<rostedt@goodmis.org>, <bsegall@google.com>, <mgorman@suse.de>,
	<vschneid@redhat.com>, <linux-kernel@vger.kernel.org>,
	<wangtao554@huawei.com>, <quzicheng@huawei.com>,
	<wuyun.abel@bytedance.com>, "Doug Smythies" <dsmythies@telus.net>
Subject: RE: [PATCH 0/4] sched: Various reweight_entity() fixes
Date: Tue, 10 Feb 2026 07:41:58 -0800
Message-ID: <001901dc9aa3$cbad47f0$6307d7d0$@telus.net>
In-Reply-To: <20260209154718.GW1282955@noisy.programming.kicks-ass.net>

On 2026.02.09.07:47 Peter Zijlstra wrote:
> On Wed, Feb 04, 2026 at 03:45:58PM +0530, K Prateek Nayak wrote:
>
>>        # Overflow on enqueue
>> 
>>            <...>-102371  [255] ... : __enqueue_entity: Overflowed cfs_rq:
>>            <...>-102371  [255] ... : dump_h_overflow_cfs_rq: cfs_rq: depth(0) weight(90894772) nr_queued(2) sum_w_vruntime(0) sum_weight(0) zero_vruntime(701164930256050) sum_shift(0) avg_vruntime(701809615900788)
>>            <...>-102371  [255] ... : dump_h_overflow_entity: se: weight(3508) vruntime(701809615900788) slice(2800000) deadline(701810568648095) curr?(1) task?(1)       <-------- cfs_rq->curr
>>            <...>-102371  [255] ... : __enqueue_entity: Overflowed se:
>>            <...>-102371  [255] ... : dump_h_overflow_entity: se: weight(90891264) vruntime(701808975077099) slice(2800000) deadline(701808975109401) curr?(0) task?(0)   <-------- new se
>
> So I spent quite some time trying to reproduce the splat, but alas.
>
> That said, I did spot something 'funny' in the above, note that
> zero_vruntime and avg_vruntime/curr->vruntime are significantly apart.
> That is not something that should happen. zero_vruntime is supposed to
> closely track avg_vruntime.
>
> That led me to hypothesise that there is a problem tracking
> zero_vruntime when there is but a single runnable task, and sure
> enough, I could reproduce that, albeit not at such a scale as to lead to
> such problems (probably too much noise on my machine).
>
> I ended up with the below; and I've already pushed out a fresh
> queue/sched/core. Could you please test again?

I tested this "V2". The results of the CPU migration times test are not good.
For this test, we expect the sample time to deviate from the nominal 1 second
by less than 10 milliseconds. The test ran for about
13 hours and 41 minutes (49,243 samples). Histogram of times:

kernel: 6.19.0-rc8-pz-v2
gov: powersave
HWP: enabled

1.000, 29206
1.001, 19598
1.002, 19
1.003, 15
1.004, 32
1.005, 25
1.006, 3
1.007, 3
1.008, 5
1.009, 5
1.010, 13
1.011, 14
1.012, 13
1.013, 6
1.014, 10
1.015, 16
1.016, 54
1.017, 116
1.018, 57
1.019, 7
1.020, 2
1.021, 1
1.023, 2
1.024, 4
1.025, 7
1.026, 1
1.027, 1
1.028, 1
1.029, 1
1.030, 2
1.037, 1

Total: 49240 : Total >= 10 mSec: 329 ( 0.67 percent)
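
The "Total" line of each histogram can be cross-checked with a short script.
What follows is only a sketch, not the tool that produced these files: the
function name summarize() and its parameters are made up for illustration,
and it assumes every data line has the form "seconds, count", optionally
followed by an annotation.

```python
def summarize(hist_text, nominal=1.000, threshold_ms=10):
    """Return (total, over, percent) for a 'seconds, count' histogram.

    Hypothetical helper: lines without exactly one comma (the kernel/gov/HWP
    header and the 'Total:' footer) are skipped.
    """
    total = 0
    over = 0
    for line in hist_text.splitlines():
        parts = line.split(",")
        if len(parts) != 2:
            continue
        try:
            seconds = float(parts[0])
            # Take only the first token so a trailing "<<< ..." note is ignored.
            count = int(parts[1].split()[0])
        except (ValueError, IndexError):
            continue
        total += count
        # Compare in whole milliseconds to avoid float trouble at the boundary.
        excess_ms = round((seconds - nominal) * 1000)
        if excess_ms >= threshold_ms:
            over += count
    percent = 100.0 * over / total if total else 0.0
    return total, over, percent


# The 6.19.0-rc1-pz (V1) histogram from this message.
V1_HIST = """\
1.000000, 19139
1.001000, 9532
1.002000, 19
1.003000, 17
1.004000, 8
1.005000, 3
1.006000, 2
1.009000, 1
"""

print(summarize(V1_HIST))  # (28721, 0, 0.0)
```

Feeding it the V2 histogram above should likewise give the total and the
count of samples at or beyond the 10 millisecond threshold.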

For reference, the previous test results are copied and pasted below.

Step 1: Confirm where we left off a year ago:

The exact same kernel from a year ago, that we ended up happy with, was used.

doug@s19:~/tmp/peterz/6.19/turbo$ cat 613.his
Kernel: 6.13.0-stock
gov: powersave
HWP: enabled

1.000000, 23195
1.001000, 10897
1.002000, 49
1.003000, 23
1.004000, 21
1.005000, 9

Total: 34194 : Total >= 10 mSec: 0 ( 0.00 percent)

So, over 9 hours, and the nominal sample time was never exceeded by more than 5 milliseconds.
Very good.
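
The run lengths quoted in this message follow from the sample counts, since
each sample nominally takes 1 second. A minimal sketch of that arithmetic
(run_duration() is a hypothetical helper, and the result is a lower bound
because most samples run slightly longer than nominal):

```python
def run_duration(samples, nominal_s=1.0):
    """Convert a sample count into (hours, minutes, seconds) of run time."""
    total = round(samples * nominal_s)
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    return hours, minutes, seconds


print(run_duration(34194))  # (9, 29, 54)
```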

Step 2: Take a baseline sample before this patch set:
Mainline kernel 6.19-rc1 was used:

doug@s19:~/tmp/peterz/6.19/turbo$ cat rc1.his
Kernel: 6.19.0-rc1-stock
gov: powersave
HWP: enabled

1.000000, 19509
1.001000, 10430
1.002000, 32
1.003000, 19
1.004000, 24
1.005000, 13
1.006000, 9
1.007000, 4
1.008000, 3
1.009000, 4
1.010000, 6
1.011000, 2
1.012000, 1
1.013000, 4
1.014000, 10
1.015000, 10
1.016000, 7
1.017000, 10
1.018000, 20
1.019000, 12
1.020000, 5
1.021000, 3
1.022000, 1
1.023000, 2
1.024000, 2  <<< Clamped. The actual values were 26 and 25 milliseconds

Total: 30142 : Total >= 10 mSec: 95 ( 0.32 percent)

What!!!
Over 8 hours.
It seems something has regressed over the last year.
Our threshold of 10 milliseconds was rather arbitrary.

Step 3: This patch set [V1], taken from Peter's git tree:

doug@s19:~/tmp/peterz/6.19/turbo$ cat 02.his
kernel: 6.19.0-rc1-pz
gov: powersave
HWP: enabled

1.000000, 19139
1.001000, 9532
1.002000, 19
1.003000, 17
1.004000, 8
1.005000, 3
1.006000, 2
1.009000, 1

Total: 28721 : Total >= 10 mSec: 0 ( 0.00 percent)

Just about 8 hours.
Never a time >= our arbitrary threshold of 10 milliseconds.
So, good.

My test computer also hung under the heavy load test,
albeit at a higher load than before.
There was no log information that I could find after the reboot.

References:
https://lore.kernel.org/lkml/000d01dc939e$0fc99fe0$2f5cdfa0$@telus.net/
https://lore.kernel.org/lkml/005f01db5a44$3bb698e0$b323caa0$@telus.net/
https://lore.kernel.org/lkml/004a01dc952b$471c94a0$d555bde0$@telus.net/
