From: Alex Shi <alex.shi@intel.com>
To: Michael Wang <wangyun@linux.vnet.ibm.com>
Cc: mingo@redhat.com, peterz@infradead.org, tglx@linutronix.de,
	akpm@linux-foundation.org, arjan@linux.intel.com, bp@alien8.de,
	pjt@google.com, namhyung@kernel.org, efault@gmx.de,
	morten.rasmussen@arm.com, vincent.guittot@linaro.org,
	gregkh@linuxfoundation.org, preeti@linux.vnet.ibm.com,
	viresh.kumar@linaro.org, linux-kernel@vger.kernel.org,
	len.brown@intel.com, rafael.j.wysocki@intel.com, jkosina@suse.cz,
	clark.williams@gmail.com, tony.luck@intel.com,
	keescook@chromium.org, mgorman@suse.de, riel@redhat.com
Subject: Re: [patch v3 0/8] sched: use runnable avg in load balance
Date: Wed, 03 Apr 2013 12:28:22 +0800
Message-ID: <515BAFE6.1020804@intel.com>
In-Reply-To: <515BA0B7.2090906@linux.vnet.ibm.com>

On 04/03/2013 11:23 AM, Michael Wang wrote:
> On 04/03/2013 10:56 AM, Alex Shi wrote:
>> On 04/03/2013 10:46 AM, Michael Wang wrote:
>>> | 15 GB   |      16 | 45110 |   | 48091 |
>>> | 15 GB   |      24 | 41415 |   | 47415 |
>>> | 15 GB   |      32 | 35988 |   | 45749 |	+27.12%
>>>
>>> Very nice improvement, I'd like to test it with the wake-affine throttle
>>> patch later, let's see what will happen ;-)
>>>
>>> Any idea on why the last one caused the regression?
>>
>> You can change the burst threshold, sysctl_sched_migration_cost, to see
>> what happens with different values. Create a similar knob and tune it.
>> +
>> +	if (cpu_rq(this_cpu)->avg_idle < sysctl_sched_migration_cost)
>> +		burst_this = 1;
>> +	if (cpu_rq(prev_cpu)->avg_idle < sysctl_sched_migration_cost)
>> +		burst_prev = 1;
>> +
>>
>>
> 
> This changes how often cpu_rq(cpu)->load.weight is adopted, correct?
> 
> So if the rq is busy, cpu_rq(cpu)->load.weight is good enough to stand
> for the load status of the rq? What's the real idea here?

This patch tries to resolve the regression in aim7-like benchmarks.
If many tasks sleep for a long time, their runnable load decays to zero. If
they are then woken up in a burst, the too-light runnable load causes a big
imbalance in select_task_rq. So such benchmarks, like aim9, drop 5~7%.
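
To illustrate why a long sleep leaves almost no runnable load: per-entity
load tracking decays a contribution geometrically, halving it about every
32 periods of roughly 1 ms. The sketch below is only a user-space model,
not the kernel's fixed-point code; decayed_load() and PELT_Y are names
made up for this example.

```c
/* User-space model of geometric load decay; not kernel code. */

/* Per-period decay factor, chosen so that PELT_Y^32 is about 0.5. */
#define PELT_Y 0.97857206

/*
 * Return the load contribution left after `periods` periods (about
 * 1 ms each) without running, starting from `weight`.
 */
static double decayed_load(double weight, unsigned int periods)
{
	double load = weight;
	unsigned int i;

	for (i = 0; i < periods; i++)
		load *= PELT_Y;
	return load;
}
```

Starting from a weight of 1024, decayed_load(1024.0, 32) is about 512, and
after 320 periods (about a third of a second) only about 1 is left, so a
burst of such wakeups looks almost weightless.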

This patch tries to detect the burst; if there is one, it uses the load
weight directly instead of the zero runnable load avg, to avoid the imbalance.
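
The detection could be modelled in user space roughly as below. This is a
sketch under assumptions, not the kernel code: struct rq_model,
rq_is_bursty() and wake_load() are hypothetical stand-ins for the per-CPU
runqueue fields, migration_cost stands in for sysctl_sched_migration_cost,
and each CPU picks its own metric here.

```c
/* Simplified stand-in for the runqueue fields involved; not kernel code. */
struct rq_model {
	unsigned long long avg_idle;	/* average idle time, ns */
	unsigned long load_weight;	/* instant load: sum of task weights */
	unsigned long runnable_load;	/* decayed runnable load average */
};

/* A CPU counts as bursty when its average idle time is below the threshold. */
static int rq_is_bursty(const struct rq_model *rq,
			unsigned long long migration_cost)
{
	return rq->avg_idle < migration_cost;
}

/* Use the instant weight on a bursty CPU, the runnable average otherwise. */
static unsigned long wake_load(const struct rq_model *rq,
			       unsigned long long migration_cost)
{
	return rq_is_bursty(rq, migration_cost) ? rq->load_weight
						: rq->runnable_load;
}
```

With a threshold of 500000 ns, a runqueue whose avg_idle is 100000 ns
reports its load_weight, while one with avg_idle of 900000 ns still reports
its runnable_load.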

But the patch may cause some unfairness if this/prev cpu are not in a burst
at the same time. So could you try the following patch?


From 4722a7567dccfb19aa5afbb49982ffb6d65e6ae5 Mon Sep 17 00:00:00 2001
From: Alex Shi <alex.shi@intel.com>
Date: Tue, 2 Apr 2013 10:27:45 +0800
Subject: [PATCH] sched: use instant load for burst wake up

If many tasks sleep for a long time, their runnable load decays to zero. If
they are then woken up in a burst, the too-light runnable load causes a big
imbalance among CPUs. So such benchmarks, like aim9, drop 5~7%.

With this patch the loss is recovered, and the result is even slightly better.

Signed-off-by: Alex Shi <alex.shi@intel.com>
---
 kernel/sched/fair.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index dbaa8ca..25ac437 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3103,12 +3103,24 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p, int sync)
 	unsigned long weight;
 	int balanced;
 	int runnable_avg;
+	int burst = 0;
 
 	idx	  = sd->wake_idx;
 	this_cpu  = smp_processor_id();
 	prev_cpu  = task_cpu(p);
-	load	  = source_load(prev_cpu, idx);
-	this_load = target_load(this_cpu, idx);
+
+	if (cpu_rq(this_cpu)->avg_idle < sysctl_sched_migration_cost ||
+		cpu_rq(prev_cpu)->avg_idle < sysctl_sched_migration_cost)
+		burst = 1;
+
+	/* use instant load for bursty waking up */
+	if (!burst) {
+		load = source_load(prev_cpu, idx);
+		this_load = target_load(this_cpu, idx);
+	} else {
+		load = cpu_rq(prev_cpu)->load.weight;
+		this_load = cpu_rq(this_cpu)->load.weight;
+	}
 
 	/*
 	 * If sync wakeup then subtract the (maximum possible)
-- 
1.7.12
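
Reduced to a user-space model, the hunk above behaves roughly like the
sketch below: one flag covers both CPUs, so both loads always come from the
same metric. struct cpu_model and pick_loads() are hypothetical names, and
avg_load only stands in for what source_load()/target_load() return.

```c
/* Simplified per-CPU state for this model; not kernel code. */
struct cpu_model {
	unsigned long long avg_idle;	/* average idle time, ns */
	unsigned long load_weight;	/* instant load */
	unsigned long avg_load;		/* stand-in for source_load()/target_load() */
};

static void pick_loads(const struct cpu_model *this_cpu,
		       const struct cpu_model *prev_cpu,
		       unsigned long long migration_cost,
		       unsigned long *this_load, unsigned long *load)
{
	/* A burst on either CPU switches both sides to the instant load. */
	int burst = this_cpu->avg_idle < migration_cost ||
		    prev_cpu->avg_idle < migration_cost;

	if (!burst) {
		*load = prev_cpu->avg_load;
		*this_load = this_cpu->avg_load;
	} else {
		*load = prev_cpu->load_weight;
		*this_load = this_cpu->load_weight;
	}
}
```

If only one of the two CPUs is below the threshold, both loads still come
from load_weight, so an instant load is never compared against a decayed
average.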

