From: Shrikanth Hegde <sshegde@linux.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org, mingo@redhat.com,
juri.lelli@redhat.com, vincent.guittot@linaro.org,
dietmar.eggemann@arm.com, rostedt@goodmis.org,
bsegall@google.com, mgorman@suse.de, vschneid@redhat.com,
clm@meta.com, Madhavan Srinivasan <maddy@linux.ibm.com>
Subject: Re: [PATCH v2 00/12] sched: Address schbench regression
Date: Wed, 9 Jul 2025 22:16:14 +0530 [thread overview]
Message-ID: <d5cb15bd-1096-45a8-9da6-a37ff490714c@linux.ibm.com> (raw)
In-Reply-To: <20250708190201.GE477119@noisy.programming.kicks-ass.net>
On 7/9/25 00:32, Peter Zijlstra wrote:
> On Mon, Jul 07, 2025 at 11:49:17PM +0530, Shrikanth Hegde wrote:
>
>> Git bisect points to
>> # first bad commit: [dc968ba0544889883d0912360dd72d90f674c140] sched: Add ttwu_queue support for delayed tasks
>
> Moo.. Are IPIs particularly expensive on your platform?
>
> The 5 cores makes me think this is a partition of sorts, but IIRC the
> power LPAR stuff was fixed physical, so routing interrupts shouldn't be
> much more expensive vs native hardware.
>
Yes, we call it a dedicated LPAR. (The hypervisor optimises such that the overhead is minimal;
I think that is true for interrupts too.)

Some more variations of testing and numbers:

The system had some config options I had messed up, such as CONFIG_SCHED_SMT=n. I copied the default
distro config back and ran the benchmark again. The numbers are slightly better than earlier, but it
is still a major regression. I also collected mpstat numbers; they show a much lower busy percentage
compared to earlier.
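For reference, here is roughly the kind of invocation used to collect the numbers below. The exact
schbench arguments are not recorded in this mail, so the thread counts shown are only illustrative:

  # schbench (https://git.kernel.org/pub/scm/linux/kernel/git/mason/schbench.git)
  # -m: message threads, -t: worker threads per messenger, -r: runtime in seconds
  # (thread counts here are illustrative, not necessarily the ones actually used)
  ./schbench -m 4 -t 16 -r 30

  # in parallel, system-wide CPU utilization; two 15s samples over the 30s run
  # would correspond to the two "all" rows in the mpstat output below
  mpstat 15 2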
--------------------------------------------------------------------------
base: 8784fb5fa2e0 (tip/master)
Wakeup Latencies percentiles (usec) runtime 30 (s) (41567569 total samples)
50.0th: 11 (10767158 samples)
90.0th: 22 (16782627 samples)
* 99.0th: 36 (3347363 samples)
99.9th: 52 (344977 samples)
min=1, max=731
RPS percentiles (requests) runtime 30 (s) (31 total samples)
20.0th: 1443840 (31 samples)
* 50.0th: 1443840 (0 samples)
90.0th: 1443840 (0 samples)
min=1433480, max=1444037
average rps: 1442889.23
CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
all 3.24 0.00 11.39 0.00 37.30 0.00 0.00 0.00 0.00 48.07
all 2.59 0.00 11.56 0.00 37.62 0.00 0.00 0.00 0.00 48.23
base + clm's patch + series:
Wakeup Latencies percentiles (usec) runtime 30 (s) (27166787 total samples)
50.0th: 57 (8242048 samples)
90.0th: 120 (10677365 samples)
* 99.0th: 182 (2435082 samples)
99.9th: 262 (241664 samples)
min=1, max=89984
RPS percentiles (requests) runtime 30 (s) (31 total samples)
20.0th: 896000 (8 samples)
* 50.0th: 902144 (10 samples)
90.0th: 928768 (10 samples)
min=881548, max=971101
average rps: 907530.10 <<< close to 40% drop in RPS.
CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
all 1.95 0.00 7.67 0.00 14.84 0.00 0.00 0.00 0.00 75.55
all 1.61 0.00 7.91 0.00 13.53 0.05 0.00 0.00 0.00 76.90
-----------------------------------------------------------------------------
- To be sure, I tried on another system. That system had 30 cores.
base:
Wakeup Latencies percentiles (usec) runtime 30 (s) (40339785 total samples)
50.0th: 12 (12585268 samples)
90.0th: 24 (15194626 samples)
* 99.0th: 44 (3206872 samples)
99.9th: 59 (320508 samples)
min=1, max=1049
RPS percentiles (requests) runtime 30 (s) (31 total samples)
20.0th: 1320960 (14 samples)
* 50.0th: 1333248 (2 samples)
90.0th: 1386496 (12 samples)
min=1309615, max=1414281
base + clm's patch + series:
Wakeup Latencies percentiles (usec) runtime 30 (s) (34318584 total samples)
50.0th: 23 (10486283 samples)
90.0th: 64 (13436248 samples)
* 99.0th: 122 (3039318 samples)
99.9th: 166 (306231 samples)
min=1, max=7255
RPS percentiles (requests) runtime 30 (s) (31 total samples)
20.0th: 1006592 (8 samples)
* 50.0th: 1239040 (9 samples)
90.0th: 1259520 (11 samples)
min=852462, max=1268841
average rps: 1144229.23 << close to a 10-15% drop in RPS
- Then I resized that 30-core LPAR into a 5-core LPAR to see if the issue shows up in a smaller
config. It did: I see a similar regression, a 40-50% drop in RPS.
- Then I made it a 6-core system, to check whether the regression is due to any ping-pong caused by
the odd core count. The numbers are similar to the 5-core case.
- Maybe the regression is higher in smaller configurations.