Re: [PATCH] sched/fair: Do not decay new task load on first enqueue

From: Matt Fleming <matt@codeblueprint.co.uk>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Wanpeng Li <kernellwp@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Mike Galbraith <umgwanakikbuti@gmail.com>,
	Yuyang Du <yuyang.du@intel.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>
Subject: Re: [PATCH] sched/fair: Do not decay new task load on first enqueue
Date: Tue, 11 Oct 2016 11:24:53 +0100
Message-ID: <20161011102453.GA16071@codeblueprint.co.uk>
In-Reply-To: <20161010173440.GA28945@linaro.org>

On Mon, 10 Oct, at 07:34:40PM, Vincent Guittot wrote:
> 
> Subject: [PATCH] sched: use load_avg for selecting idlest group
> 
> find_idlest_group only compares the runnable_load_avg when looking for
> the idlest group. But in fork-intensive use cases like hackbench, where tasks
> block quickly after the fork, this can lead to selecting the same CPU
> even though other CPUs, which have similar runnable load but a lower load_avg,
> could be chosen instead.
> 
> When the runnable_load_avg of 2 CPUs are close, we now take into account
> the amount of blocked load as a 2nd selection factor.
> 
> For use cases like hackbench, this enables the scheduler to select different
> CPUs during the fork sequence and to spread tasks across the system.
> 
> Tests have been done on a Hikey board (ARM based octo cores) for several
> kernels. The results below give min, max, avg and stdev values of 18 runs
> with each configuration.
> 
> The v4.8+patches configuration also includes the change below, which is part of the
> proposal made by Peter to ensure that the clock is up to date when the
> forked task is attached to the rq.
> 
> @@ -2568,6 +2568,7 @@ void wake_up_new_task(struct task_struct *p)
>  	__set_task_cpu(p, select_task_rq(p, task_cpu(p), SD_BALANCE_FORK, 0));
>  #endif
>  	rq = __task_rq_lock(p, &rf);
> +	update_rq_clock(rq);
>  	post_init_entity_util_avg(&p->se);
>  
>  	activate_task(rq, p, 0);
> 
> hackbench -P -g 1 
> 
>        ea86cb4b7621  7dc603c9028e  v4.8        v4.8+patches
> min    0.049         0.050         0.051       0.048
> avg    0.057         0.057(0%)     0.057(0%)   0.055(+5%)
> max    0.066         0.068         0.070       0.063
> stdev  +/-9%         +/-9%         +/-8%       +/-9%
> 
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>  kernel/sched/fair.c | 40 ++++++++++++++++++++++++++++++++--------
>  1 file changed, 32 insertions(+), 8 deletions(-)

This patch looks pretty good to me and this 2-socket 48-cpu Xeon
(domain0 SMT, domain1 MC, domain2 NUMA) shows a few nice performance
improvements, and no regressions for various combinations of hackbench
sockets/pipes and group numbers.

But on a 2-socket 8-cpu Xeon (domain0 MC, domain1 DIE) running,

  perf stat --null -r 25 -- hackbench -pipe 30 process 1000

I see a regression,

  baseline: 2.41228
  patched : 2.64528 (-9.7%)

Even though the spread of tasks during fork[0] is improved,

  baseline CV: 0.478%
  patched CV : 0.042%

Clearly the spread wasn't *that* bad to begin with on this machine for
this workload. I consider the baseline spread to be pretty well
distributed. Some other factor must be at play.

Patched runqueue latencies are higher (max9* are percentiles),

  baseline: mean: 615932.69 max90: 75272.00 max95: 175985.00 max99: 5884778.00 max: 1694084747.00
  patched : mean: 882026.28 max90: 92015.00 max95: 291760.00 max99: 7590167.00 max: 1841154776.00

And there are more migrations of hackbench tasks,

  baseline: total: 5390 cross-MC: 3810 cross-DIE: 1580
  patched : total: 7222 cross-MC: 4591 cross-DIE: 2631
                 (+34.0%)       (+20.5%)        (+66.5%)

That's a lot more costly cross-DIE migrations. I think this patch is
along the right lines, but there's something fishy happening on this
box.
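
For anyone wanting to check the percentages above, they are plain
relative changes against baseline. A minimal sketch in Python (the
helper name pct_change is mine, not from any tool used here). The
hackbench figure is elapsed time, so an increase is a slowdown, which
is why it is reported above with a negative sign:

```python
def pct_change(baseline, patched):
    """Relative change of patched versus baseline, in percent."""
    return 100.0 * (patched - baseline) / baseline


# Figures copied from this mail: (baseline, patched)
results = {
    "hackbench time": (2.41228, 2.64528),
    "migrations total": (5390, 7222),
    "migrations cross-MC": (3810, 4591),
    "migrations cross-DIE": (1580, 2631),
}

for name, (baseline, patched) in results.items():
    print(f"{name}: {pct_change(baseline, patched):+.1f}%")
```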

[0] - Fork task placement spread measurement:

      cat /tmp/trace.$1 | grep -E "wakeup_new.*comm=hackbench" | \
	sed -e 's/.*target_cpu=//' | sort | uniq -c | awk '{print $1}'
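
The CV numbers above can be derived from the per-CPU counts that this
pipeline prints. A minimal sketch, assuming CV here means population
standard deviation divided by the mean, expressed as a percentage (the
function names are hypothetical, and the exact script used for the
figures above may differ):

```python
import statistics


def parse_counts(lines):
    """Parse one integer count per line, skipping blank lines."""
    return [int(line) for line in lines if line.strip()]


def coefficient_of_variation(counts):
    """Return pstdev/mean as a percentage; 0.0 for empty or all-zero input."""
    if not counts:
        return 0.0
    mean = statistics.fmean(counts)
    if mean == 0:
        return 0.0
    return 100.0 * statistics.pstdev(counts) / mean
```

Calling coefficient_of_variation(parse_counts(sys.stdin)) on the output
of the command above gives one figure per trace, where a lower value
means forked tasks were spread more evenly across CPUs. Note that CPUs
which received no new tasks do not show up in the uniq -c output, so
they are not counted.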

