From: Chen Yu <yu.c.chen@intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: <mingo@kernel.org>, <vincent.guittot@linaro.org>,
<linux-kernel@vger.kernel.org>, <juri.lelli@redhat.com>,
<dietmar.eggemann@arm.com>, <rostedt@goodmis.org>,
<bsegall@google.com>, <mgorman@suse.de>, <bristot@redhat.com>,
<corbet@lwn.net>, <qyousef@layalina.io>, <chris.hyser@oracle.com>,
<patrick.bellasi@matbug.net>, <pjt@google.com>, <pavel@ucw.cz>,
<qperret@google.com>, <tim.c.chen@linux.intel.com>,
<joshdon@google.com>, <timj@gnu.org>, <kprateek.nayak@amd.com>,
<youssefesmat@chromium.org>, <joel@joelfernandes.org>,
<efault@gmx.de>
Subject: Re: [PATCH 06/17] sched/fair: Add lag based placement
Date: Thu, 13 Apr 2023 23:42:34 +0800
Message-ID: <ZDgi6g4hRYCfbxcu@chenyu5-mobl1>
In-Reply-To: <20230405094720.GA4253@hirez.programming.kicks-ass.net>

On 2023-04-05 at 11:47:20 +0200, Peter Zijlstra wrote:
> On Mon, Apr 03, 2023 at 05:18:06PM +0800, Chen Yu wrote:
> > On 2023-03-28 at 11:26:28 +0200, Peter Zijlstra wrote:
So I launched the test on another platform with more CPUs,
baseline: 6.3-rc6
compare: sched/eevdf branch on top of commit 8c59a975d5ee ("sched/eevdf: Debug / validation crud")
--------------------------------------------------------------------------------------
schbench: mthreads = 2

                     baseline            eevdf+NO_PLACE_BONUS
worker_threads
  25%                   80.00   +19.2%        95.40    schbench.latency_90%_us
                       (0.00%)              (0.51%)    stddev
  50%                  183.70    +2.2%       187.80    schbench.latency_90%_us
                       (0.35%)              (0.46%)    stddev
  75%                    4065   -21.4%         3193    schbench.latency_90%_us
                      (69.65%)              (3.42%)    stddev
 100%                   13696   -92.4%         1040    schbench.latency_90%_us
                       (5.25%)             (69.03%)    stddev
 125%                   16457   -78.6%         3514    schbench.latency_90%_us
                      (10.50%)              (6.25%)    stddev
 150%                   31177   -77.5%         7008    schbench.latency_90%_us
                       (6.84%)              (5.19%)    stddev
 175%                   40729   -75.1%        10160    schbench.latency_90%_us
                       (6.11%)              (2.53%)    stddev
 200%                   52224   -74.4%        13385    schbench.latency_90%_us
                      (10.42%)              (1.72%)    stddev
                     eevdf+NO_PLACE_BONUS  eevdf+PLACE_BONUS
worker_threads
  25%                   96.30    +0.2%        96.50    schbench.latency_90%_us
                       (0.66%)              (0.52%)    stddev
  50%                  187.20    -3.0%       181.60    schbench.latency_90%_us
                       (0.21%)              (0.71%)    stddev
  75%                    3034   -84.1%       482.50    schbench.latency_90%_us
                       (5.56%)             (27.40%)    stddev
 100%                  648.20  +114.7%         1391    schbench.latency_90%_us
                      (64.70%)             (10.05%)    stddev
 125%                    3506    -3.0%         3400    schbench.latency_90%_us
                       (2.79%)              (9.89%)    stddev
 150%                    6793   +29.6%         8803    schbench.latency_90%_us
                       (1.39%)              (7.30%)    stddev
 175%                    9961    +9.2%        10876    schbench.latency_90%_us
                       (1.51%)              (6.54%)    stddev
 200%                   13660    +3.3%        14118    schbench.latency_90%_us
                       (1.38%)              (6.02%)    stddev
Summary for schbench: in most cases eevdf+NO_PLACE_BONUS gives the best performance.
This is consistent with the previous test on another platform with a smaller number
of CPUs: eevdf benefits schbench overall.
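For reference, a sketch of how these load levels could be reproduced; the exact
schbench flags are an assumption on my side (the message only states mthreads = 2),
taking -m as message threads and -t as worker threads scaled to the online CPU count:

```shell
#!/bin/sh
# Hypothetical reproduction of the schbench runs above (flags assumed):
# -m 2 matches mthreads = 2; -t is the worker-thread count, taken here
# as a percentage of the online CPUs; -r is the runtime in seconds.
NR_CPUS=${NR_CPUS:-$(getconf _NPROCESSORS_ONLN)}
for pct in 25 50 75 100 125 150 175 200; do
    workers=$((NR_CPUS * pct / 100))
    echo "schbench -m 2 -t $workers -r 100"
done
```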
---------------------------------------------------------------------------------------
hackbench: ipc=pipe mode=process fd=20 (default)

                     baseline            eevdf+NO_PLACE_BONUS
worker_threads
   1                   103103    -0.3%       102794    hackbench.throughput_avg
  25%                  115562  +825.7%      1069725    hackbench.throughput_avg
  50%                  296514  +352.1%      1340414    hackbench.throughput_avg
  75%                  498059  +190.8%      1448156    hackbench.throughput_avg
 100%                  804560   +74.8%      1406413    hackbench.throughput_avg

                     eevdf+NO_PLACE_BONUS  eevdf+PLACE_BONUS
worker_threads
   1                   102172    +1.5%       103661    hackbench.throughput_avg
  25%                 1076503   -52.8%       508612    hackbench.throughput_avg
  50%                 1394311   -68.2%       443251    hackbench.throughput_avg
  75%                 1476502   -70.2%       440391    hackbench.throughput_avg
 100%                 1512706   -76.2%       359741    hackbench.throughput_avg
Summary for hackbench pipe process test: in most cases eevdf+NO_PLACE_BONUS gives the best performance.
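A sketch of the hackbench invocations, again with the flag mapping assumed rather
than taken from the message (--pipe for ipc=pipe, default process mode, -f 20 for
the fd:20 setting, group count scaled to the CPU count; the loop count is a
placeholder):

```shell
#!/bin/sh
# Hypothetical reproduction of the hackbench runs above (flags assumed):
# --pipe selects pipe IPC, -f 20 gives 20 file descriptors per
# sender/receiver, and -g scales the number of groups with the CPUs.
NR_CPUS=${NR_CPUS:-$(getconf _NPROCESSORS_ONLN)}
for pct in 25 50 75 100; do
    groups=$((NR_CPUS * pct / 100))
    echo "hackbench --pipe -g $groups -f 20 -l 100000"
done
```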
-------------------------------------------------------------------------------------
unixbench: test=pipe

                     baseline            eevdf+NO_PLACE_BONUS
nr_task
   1                     1405    -0.5%         1398    unixbench.score
  25%                   77942    +0.9%        78680    unixbench.score
  50%                  155384    +1.1%       157100    unixbench.score
  75%                  179756    +0.3%       180295    unixbench.score
 100%                  204030    -0.2%       203540    unixbench.score
 125%                  204972    -0.4%       204062    unixbench.score
 150%                  205891    -0.5%       204792    unixbench.score
 175%                  207051    -0.5%       206047    unixbench.score
 200%                  209387    -0.9%       207559    unixbench.score

                     eevdf+NO_PLACE_BONUS  eevdf+PLACE_BONUS
nr_task
   1                     1405    -0.3%         1401    unixbench.score
  25%                   78640    +0.0%        78647    unixbench.score
  50%                  157153    -0.0%       157093    unixbench.score
  75%                  180152    +0.0%       180205    unixbench.score
 100%                  203479    -0.0%       203464    unixbench.score
 125%                  203866    +0.1%       204013    unixbench.score
 150%                  204872    -0.0%       204838    unixbench.score
 175%                  205799    +0.0%       205824    unixbench.score
 200%                  207152    +0.2%       207546    unixbench.score
Seems to have no impact on unixbench in pipe mode.
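A sketch of the UnixBench invocations: the Run driver with -c parallel copies,
reading nr_task=100% as one copy per online CPU (an assumption; the message does
not spell out the mapping):

```shell
#!/bin/sh
# Hypothetical reproduction of the UnixBench pipe runs above: -c sets
# the number of parallel copies of the "pipe" (Pipe Throughput) test.
NR_CPUS=${NR_CPUS:-$(getconf _NPROCESSORS_ONLN)}
for pct in 25 50 75 100 125 150 175 200; do
    copies=$((NR_CPUS * pct / 100))
    echo "./Run -c $copies pipe"
done
```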
--------------------------------------------------------------------------------
netperf: TCP_RR, ipv4, loopback

                     baseline            eevdf+NO_PLACE_BONUS
nr_threads
  25%                   56232    -1.7%        55265    netperf.Throughput_tps
  50%                   49876    -3.1%        48338    netperf.Throughput_tps
  75%                   24281    +1.9%        24741    netperf.Throughput_tps
 100%                   73598    +3.8%        76375    netperf.Throughput_tps
 125%                   59119    +1.4%        59968    netperf.Throughput_tps
 150%                   49124    +1.2%        49727    netperf.Throughput_tps
 175%                   41929    +0.2%        42004    netperf.Throughput_tps
 200%                   36543    +0.4%        36677    netperf.Throughput_tps

                     eevdf+NO_PLACE_BONUS  eevdf+PLACE_BONUS
nr_threads
  25%                   55296    +4.7%        57877    netperf.Throughput_tps
  50%                   48659    +1.9%        49585    netperf.Throughput_tps
  75%                   24741    +0.3%        24807    netperf.Throughput_tps
 100%                   76455    +6.7%        81548    netperf.Throughput_tps
 125%                   60082    +7.6%        64622    netperf.Throughput_tps
 150%                   49618    +7.7%        53429    netperf.Throughput_tps
 175%                   41974    +7.6%        45160    netperf.Throughput_tps
 200%                   36677    +6.5%        39067    netperf.Throughput_tps
Compared with the baseline, eevdf seems to have little impact on netperf, although
PLACE_BONUS shows a modest (up to ~8%) improvement over NO_PLACE_BONUS at higher loads.
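The shape of one TCP_RR load level over loopback, with the flags assumed on my side:
a local netserver plus N concurrent TCP_RR clients, N scaled to the online CPU count:

```shell
#!/bin/sh
# Hypothetical reproduction of one netperf TCP_RR load level (flags
# assumed): start netserver, then launch nr concurrent request/response
# clients against loopback for 60 seconds each.
NR_CPUS=${NR_CPUS:-$(getconf _NPROCESSORS_ONLN)}
nr=$((NR_CPUS * 125 / 100))     # e.g. the 125% load level
echo "netserver"
i=0
while [ "$i" -lt "$nr" ]; do
    echo "netperf -H 127.0.0.1 -t TCP_RR -l 60 &"
    i=$((i + 1))
done
```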
-----------------------------------------------------------------------------------
stress-ng: futex

                     baseline            eevdf+NO_PLACE_BONUS
nr_threads
  25%                  207926   -21.0%       164356    stress-ng.futex.ops_per_sec
  50%                   46611   -16.1%        39130    stress-ng.futex.ops_per_sec
  75%                   71381   -11.3%        63283    stress-ng.futex.ops_per_sec
 100%                   58766    -0.8%        58269    stress-ng.futex.ops_per_sec
 125%                   59859   +11.3%        66645    stress-ng.futex.ops_per_sec
 150%                   52869    +7.6%        56863    stress-ng.futex.ops_per_sec
 175%                   49607   +22.9%        60969    stress-ng.futex.ops_per_sec
 200%                   56011   +11.8%        62631    stress-ng.futex.ops_per_sec
When the system is not busy, there is a regression; when the system gets busier,
there is some improvement. Even with PLACE_BONUS enabled, the regression remains.
Per the perf profile of the 50% case, the ratio of wakeups is nearly the same with
and without the eevdf patches applied:
  50.82   -0.7   50.15   perf-profile.children.cycles-pp.futex_wake
but there are more preemptions with eevdf enabled:
 135095  +15.4%  155943  stress-ng.time.involuntary_context_switches
which is close to the -16.1% performance loss.
That is to say, eevdf helps the futex wakee grab the CPU more easily (which benefits
latency), but might have some impact on throughput?
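The preemption counter quoted above can also be sampled per task straight from
procfs; this is the standard nonvoluntary_ctxt_switches field in
/proc/<pid>/status, nothing specific to the eevdf patches:

```shell
#!/bin/sh
# Read the involuntary (preemption) context-switch count for a task;
# here the current shell itself, but any PID works the same way.
pid=$$
n=$(awk '/^nonvoluntary_ctxt_switches:/ {print $2}' "/proc/$pid/status")
echo "pid $pid: $n involuntary context switches"
```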
thanks,
Chenyu