public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] sched: modify move_tasks() to improve load balancing outcomes
@ 2006-04-13  6:57 Peter Williams
  2006-04-13 23:51 ` Siddha, Suresh B
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Williams @ 2006-04-13  6:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Con Kolivas, Chen, Kenneth W, Ingo Molnar, Mike Galbraith,
	Nick Piggin, Siddha, Suresh B, Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 1189 bytes --]

Problem:

The move_tasks() function is designed to move UP TO the amount of load 
it is asked to move and in doing this it skips over tasks looking for 
ones whose load weights are less than or equal to the remaining load to 
be moved.  This is (in general) a good thing but it has the unfortunate 
result of breaking one of the original load balancer's good points: 
namely, that (within the limits imposed by the active/expired array 
model and the fact the expired is processed first) it moves high 
priority tasks before low priority ones and this means there's a good 
chance (see active/expired problem for why it's only a chance) that the 
highest priority task on the queue but not actually on the CPU will be 
moved to the other CPU where (as a high priority task) it may preempt 
the current task.

Solution:

Modify move_tasks() so that high priority tasks are not skipped when 
moving them will make them the highest priority task on their new run queue.

Signed-off-by: Peter Williams <pwil3058@bigpond.com.au>

-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

[-- Attachment #2: smpnice-modify-move_tasks --]
[-- Type: text/plain, Size: 1478 bytes --]

Index: MM-2.6.17-rc1-mm2/kernel/sched.c
===================================================================
--- MM-2.6.17-rc1-mm2.orig/kernel/sched.c	2006-04-13 10:53:32.000000000 +1000
+++ MM-2.6.17-rc1-mm2/kernel/sched.c	2006-04-13 11:08:45.000000000 +1000
@@ -2043,7 +2043,7 @@ static int move_tasks(runqueue_t *this_r
 {
 	prio_array_t *array, *dst_array;
 	struct list_head *head, *curr;
-	int idx, pulled = 0, pinned = 0;
+	int idx, pulled = 0, pinned = 0, this_min_prio;
 	long rem_load_move;
 	task_t *tmp;
 
@@ -2052,6 +2052,7 @@ static int move_tasks(runqueue_t *this_r
 
 	rem_load_move = max_load_move;
 	pinned = 1;
+	this_min_prio = this_rq->curr->prio;
 
 	/*
 	 * We first consider expired tasks. Those will likely not be
@@ -2091,7 +2092,12 @@ skip_queue:
 
 	curr = curr->prev;
 
-	if (tmp->load_weight > rem_load_move ||
+	/*
+	 * To help distribute high priority tasks accross CPUs we don't
+	 * skip a task if it will be the highest priority task (i.e. smallest
+	 * prio value) on its new queue regardless of its load weight
+	 */
+	if ((idx >= this_min_prio && tmp->load_weight > rem_load_move) ||
 	    !can_migrate_task(tmp, busiest, this_cpu, sd, idle, &pinned)) {
 		if (curr != head)
 			goto skip_queue;
@@ -2113,6 +2119,8 @@ skip_queue:
 	 * and the prescribed amount of weighted load.
 	 */
 	if (pulled < max_nr_move && rem_load_move > 0) {
+		if (idx < this_min_prio)
+			this_min_prio = idx;
 		if (curr != head)
 			goto skip_queue;
 		idx++;

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] sched: modify move_tasks() to improve load balancing outcomes
  2006-04-13  6:57 [PATCH] sched: modify move_tasks() to improve load balancing outcomes Peter Williams
@ 2006-04-13 23:51 ` Siddha, Suresh B
  2006-04-14  1:50   ` Peter Williams
  0 siblings, 1 reply; 9+ messages in thread
From: Siddha, Suresh B @ 2006-04-13 23:51 UTC (permalink / raw)
  To: Peter Williams
  Cc: Andrew Morton, Con Kolivas, Chen, Kenneth W, Ingo Molnar,
	Mike Galbraith, Nick Piggin, Siddha, Suresh B,
	Linux Kernel Mailing List

On Thu, Apr 13, 2006 at 04:57:15PM +1000, Peter Williams wrote:
> Problem:
> 
> The move_tasks() function is designed to move UP TO the amount of load 
> it is asked to move and in doing this it skips over tasks looking for 
> ones whose load weights are less than or equal to the remaining load to 
> be moved.  This is (in general) a good thing but it has the unfortunate 
> result of breaking one of the original load balancer's good points: 

with previous load balancer code it was a good point.. because all tasks
were weighted the same from load balancer perspective.. but now the
imbalance represents what task to move (atleast in the working
cases :)

> namely, that (within the limits imposed by the active/expired array 
> model and the fact the expired is processed first) it moves high 
> priority tasks before low priority ones and this means there's a good 
> chance (see active/expired problem for why it's only a chance) that the 
> highest priority task on the queue but not actually on the CPU will be 
> moved to the other CPU where (as a high priority task) it may preempt 
> the current task.
> 
> Solution:
> 
> Modify move_tasks() so that high priority tasks are not skipped when 
> moving them will make them the highest priority task on their new run queue.

you mean the highest priority task on the current active list of the new 
run queue, right?

There will be some unnecessary movements of high priority tasks because of
this...

thanks,
suresh

> 
> Signed-off-by: Peter Williams <pwil3058@bigpond.com.au>
> 
> -- 
> Peter Williams                                   pwil3058@bigpond.net.au
> 
> "Learning, n. The kind of ignorance distinguishing the studious."
>   -- Ambrose Bierce

> Index: MM-2.6.17-rc1-mm2/kernel/sched.c
> ===================================================================
> --- MM-2.6.17-rc1-mm2.orig/kernel/sched.c	2006-04-13 10:53:32.000000000 +1000
> +++ MM-2.6.17-rc1-mm2/kernel/sched.c	2006-04-13 11:08:45.000000000 +1000
> @@ -2043,7 +2043,7 @@ static int move_tasks(runqueue_t *this_r
>  {
>  	prio_array_t *array, *dst_array;
>  	struct list_head *head, *curr;
> -	int idx, pulled = 0, pinned = 0;
> +	int idx, pulled = 0, pinned = 0, this_min_prio;
>  	long rem_load_move;
>  	task_t *tmp;
>  
> @@ -2052,6 +2052,7 @@ static int move_tasks(runqueue_t *this_r
>  
>  	rem_load_move = max_load_move;
>  	pinned = 1;
> +	this_min_prio = this_rq->curr->prio;
>  
>  	/*
>  	 * We first consider expired tasks. Those will likely not be
> @@ -2091,7 +2092,12 @@ skip_queue:
>  
>  	curr = curr->prev;
>  
> -	if (tmp->load_weight > rem_load_move ||
> +	/*
> +	 * To help distribute high priority tasks accross CPUs we don't
> +	 * skip a task if it will be the highest priority task (i.e. smallest
> +	 * prio value) on its new queue regardless of its load weight
> +	 */
> +	if ((idx >= this_min_prio && tmp->load_weight > rem_load_move) ||
>  	    !can_migrate_task(tmp, busiest, this_cpu, sd, idle, &pinned)) {
>  		if (curr != head)
>  			goto skip_queue;
> @@ -2113,6 +2119,8 @@ skip_queue:
>  	 * and the prescribed amount of weighted load.
>  	 */
>  	if (pulled < max_nr_move && rem_load_move > 0) {
> +		if (idx < this_min_prio)
> +			this_min_prio = idx;
>  		if (curr != head)
>  			goto skip_queue;
>  		idx++;

Peter, Are you sure that this is a converging solution? If we want to

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] sched: modify move_tasks() to improve load balancing outcomes
  2006-04-13 23:51 ` Siddha, Suresh B
@ 2006-04-14  1:50   ` Peter Williams
  2006-04-14 18:27     ` Siddha, Suresh B
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Williams @ 2006-04-14  1:50 UTC (permalink / raw)
  To: Siddha, Suresh B
  Cc: Andrew Morton, Con Kolivas, Chen, Kenneth W, Ingo Molnar,
	Mike Galbraith, Nick Piggin, Linux Kernel Mailing List

Siddha, Suresh B wrote:
> On Thu, Apr 13, 2006 at 04:57:15PM +1000, Peter Williams wrote:
>> Problem:
>>
>> The move_tasks() function is designed to move UP TO the amount of load 
>> it is asked to move and in doing this it skips over tasks looking for 
>> ones whose load weights are less than or equal to the remaining load to 
>> be moved.  This is (in general) a good thing but it has the unfortunate 
>> result of breaking one of the original load balancer's good points: 
> 
> with previous load balancer code it was a good point.. because all tasks
> were weighted the same from load balancer perspective.. but now the
> imbalance represents what task to move (atleast in the working
> cases :)

That's the option 4 case in my original mail.  Are you suggesting that 
it would have been the better option to adopt?  If so, why?

I don't like it because (leaving aside the active/expired array issues) 
it would move the X top priority tasks across where X is the number of 
tasks required to meet the requirement.  If we could arrange for it to 
skip every second one without making the code too complicated (i.e. 
there'd be a possible need for multiple passes until the required load 
was moved).

> 
>> namely, that (within the limits imposed by the active/expired array 
>> model and the fact the expired is processed first) it moves high 
>> priority tasks before low priority ones and this means there's a good 
>> chance (see active/expired problem for why it's only a chance) that the 
>> highest priority task on the queue but not actually on the CPU will be 
>> moved to the other CPU where (as a high priority task) it may preempt 
>> the current task.
>>
>> Solution:
>>
>> Modify move_tasks() so that high priority tasks are not skipped when 
>> moving them will make them the highest priority task on their new run queue.
> 
> you mean the highest priority task on the current active list of the new 
> run queue, right?

Good point.  this_min_prio should probably be initialized to the minimum 
of this_rq->curr->prio and this_rq->best_expired_prio rather just using 
this_rq->curr->prio.

> 
> There will be some unnecessary movements of high priority tasks because of
> this...

How so.  At most one task per move_tasks() will be moved as a result of 
this code that wouldn't have been moved otherwise.  That task will be a 
high priority task stuck behind a higher priority task on its current 
queue that will be the highest priority on its new queue causing a 
preempt and access to the CPU.  I wouldn't call this unnecessary.

> Peter, Are you sure that this is a converging solution? If we want to

Yes, I think we're getting there.

I think we need changes to try_to_wake_up() to help high priority tasks 
find idle CPUs or CPUs where they would preempt when they wake up. 
Leaving this to the load balancer is bad for these tasks latencies.  I 
think that this change needs to be done independently of smpnice as it 
would be useful even without smpnice.  I'm hoping Ingo or Nick will 
comment on this proposal.

It would also help if you fixed the active load balance code so that 
it's not necessary to distort normal load balancing to accommodate it. 
I haven't had time to look at it myself (other than a quick glance) yet.

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] sched: modify move_tasks() to improve load balancing outcomes
  2006-04-14  1:50   ` Peter Williams
@ 2006-04-14 18:27     ` Siddha, Suresh B
  2006-04-15  0:54       ` Peter Williams
  0 siblings, 1 reply; 9+ messages in thread
From: Siddha, Suresh B @ 2006-04-14 18:27 UTC (permalink / raw)
  To: Peter Williams
  Cc: Siddha, Suresh B, Andrew Morton, Con Kolivas, Chen, Kenneth W,
	Ingo Molnar, Mike Galbraith, Nick Piggin,
	Linux Kernel Mailing List

On Fri, Apr 14, 2006 at 11:50:10AM +1000, Peter Williams wrote:
> Siddha, Suresh B wrote:
> > On Thu, Apr 13, 2006 at 04:57:15PM +1000, Peter Williams wrote:
> >> Problem:
> >>
> >> The move_tasks() function is designed to move UP TO the amount of load 
> >> it is asked to move and in doing this it skips over tasks looking for 
> >> ones whose load weights are less than or equal to the remaining load to 
> >> be moved.  This is (in general) a good thing but it has the unfortunate 
> >> result of breaking one of the original load balancer's good points: 
> > 
> > with previous load balancer code it was a good point.. because all tasks
> > were weighted the same from load balancer perspective.. but now the
> > imbalance represents what task to move (atleast in the working
> > cases :)
> 
> That's the option 4 case in my original mail.  Are you suggesting that 
> it would have been the better option to adopt?  If so, why?

No. I was not suggesting option-4. With this change in move_tasks, we will 
be overriding the decision what ever we made while calculating imbalance.
Lets see for example, we have a simple DP system. With proc-0 running
one high priority and one low priority task, Proc-1 running one
low priority task. Ideally we would like to move low priority task from
P0 to P1. But with this patch, we may end up moving high priority task
from P0 to P1. But slowly after sometime(depending on high priority task
is on active/expired list), we will converge to the expected
solution..

> > you mean the highest priority task on the current active list of the new 
> > run queue, right?
> 
> Good point.  this_min_prio should probably be initialized to the minimum 
> of this_rq->curr->prio and this_rq->best_expired_prio rather just using 
> this_rq->curr->prio.

yes.

> 
> > 
> > There will be some unnecessary movements of high priority tasks because of
> > this...
> 
> How so.  At most one task per move_tasks() will be moved as a result of 
> this code that wouldn't have been moved otherwise.  That task will be a 
> high priority task stuck behind a higher priority task on its current 
> queue that will be the highest priority on its new queue causing a 
> preempt and access to the CPU.  I wouldn't call this unnecessary.

highest priority task can be in the expired list with normal priority
task running.. as in my above example.

> 
> > Peter, Are you sure that this is a converging solution? If we want to
> 
> Yes, I think we're getting there.
> 
> I think we need changes to try_to_wake_up() to help high priority tasks 
> find idle CPUs or CPUs where they would preempt when they wake up. 
> Leaving this to the load balancer is bad for these tasks latencies.  I 
> think that this change needs to be done independently of smpnice as it 
> would be useful even without smpnice.  I'm hoping Ingo or Nick will 
> comment on this proposal.
> 
> It would also help if you fixed the active load balance code so that 
> it's not necessary to distort normal load balancing to accommodate it. 
> I haven't had time to look at it myself (other than a quick glance) yet.

The only special check in find_busiest_group() helping MT/MC balancing
is pwr_now and pwr_move calculations.. These calculations will also help,
in future when we are dealing with sched groups which are not symmetric.
Asymmetries can be caused in scenarios like cpufreq, cpu logical hotplug..

I think we are unnecessarily behind active load balance...

thanks,
suresh

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] sched: modify move_tasks() to improve load balancing outcomes
  2006-04-14 18:27     ` Siddha, Suresh B
@ 2006-04-15  0:54       ` Peter Williams
  2006-04-17 16:59         ` Siddha, Suresh B
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Williams @ 2006-04-15  0:54 UTC (permalink / raw)
  To: Siddha, Suresh B
  Cc: Andrew Morton, Con Kolivas, Chen, Kenneth W, Ingo Molnar,
	Mike Galbraith, Nick Piggin, Linux Kernel Mailing List

Siddha, Suresh B wrote:
> On Fri, Apr 14, 2006 at 11:50:10AM +1000, Peter Williams wrote:
>> Siddha, Suresh B wrote:
>>> On Thu, Apr 13, 2006 at 04:57:15PM +1000, Peter Williams wrote:
>>>> Problem:
>>>>
>>>> The move_tasks() function is designed to move UP TO the amount of load 
>>>> it is asked to move and in doing this it skips over tasks looking for 
>>>> ones whose load weights are less than or equal to the remaining load to 
>>>> be moved.  This is (in general) a good thing but it has the unfortunate 
>>>> result of breaking one of the original load balancer's good points: 
>>> with previous load balancer code it was a good point.. because all tasks
>>> were weighted the same from load balancer perspective.. but now the
>>> imbalance represents what task to move (atleast in the working
>>> cases :)
>> That's the option 4 case in my original mail.  Are you suggesting that 
>> it would have been the better option to adopt?  If so, why?
> 
> No. I was not suggesting option-4. With this change in move_tasks, we will 
> be overriding the decision what ever we made while calculating imbalance.
> Lets see for example, we have a simple DP system. With proc-0 running
> one high priority and one low priority task, Proc-1 running one
> low priority task. Ideally we would like to move low priority task from
> P0 to P1. But with this patch, we may end up moving high priority task
> from P0 to P1. But slowly after sometime(depending on high priority task
> is on active/expired list), we will converge to the expected
> solution..

Yes, there are problems with the active/expired arrays but they're no 
worse with smpnice than they are without it.

Going to a single priority array would eliminate this problem not to 
mention other problems to do with when to excuse tasks from being put on 
the expired array at the end of their time slice (see Mike Galbraith's 
patches).  But I haven't had much luck pushing that barrow. :-)

> 
>>> you mean the highest priority task on the current active list of the new 
>>> run queue, right?
>> Good point.  this_min_prio should probably be initialized to the minimum 
>> of this_rq->curr->prio and this_rq->best_expired_prio rather just using 
>> this_rq->curr->prio.
> 
> yes.
> 
>>> There will be some unnecessary movements of high priority tasks because of
>>> this...
>> How so.  At most one task per move_tasks() will be moved as a result of 
>> this code that wouldn't have been moved otherwise.  That task will be a 
>> high priority task stuck behind a higher priority task on its current 
>> queue that will be the highest priority on its new queue causing a 
>> preempt and access to the CPU.  I wouldn't call this unnecessary.
> 
> highest priority task can be in the expired list with normal priority
> task running.. as in my above example.

Yes, but this will get it onto a CPU sooner so it's not all bad.  Once 
again it's no worse than without smpnice.

I said in the patch description that there are active/expired array 
issues that make load balancing less than perfect with and without 
smpnice.  How far we go trying to eliminate them is the question.

If you have a better suggestion for how move_tasks() does its job, how 
about providing a patch and with supporting arguments as to why its 
better.  If it is better then we can use it.

> 
>>> Peter, Are you sure that this is a converging solution? If we want to
>> Yes, I think we're getting there.
>>
>> I think we need changes to try_to_wake_up() to help high priority tasks 
>> find idle CPUs or CPUs where they would preempt when they wake up. 
>> Leaving this to the load balancer is bad for these tasks latencies.  I 
>> think that this change needs to be done independently of smpnice as it 
>> would be useful even without smpnice.  I'm hoping Ingo or Nick will 
>> comment on this proposal.
>>
>> It would also help if you fixed the active load balance code so that 
>> it's not necessary to distort normal load balancing to accommodate it. 
>> I haven't had time to look at it myself (other than a quick glance) yet.
> 
> The only special check in find_busiest_group() helping MT/MC balancing
> is pwr_now and pwr_move calculations..

What about the very messy code I had to put in so that 
find_busiest_group() would return a group even if there were no queues 
in the group with more than one task.  Similar for find_busiest_queue().

> These calculations will also help,
> in future when we are dealing with sched groups which are not symmetric.
> Asymmetries can be caused in scenarios like cpufreq, cpu logical hotplug..

I meant for you to move the (highly speculative) active load balancing 
trigger out of load_balance() so that we can stop returning busiest 
groups and queues that have no hope of participating in successful load 
balancing.

> 
> I think we are unnecessarily behind active load balance...

I'm not sure what you mean here.  I'm not behind it at all.  I can see 
the need for HT systems to be able to move the only running task of a 
CPU to another CPU in another package but I think the implementation is 
appalling.  I think it needs a complete rewrite.

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] sched: modify move_tasks() to improve load balancing outcomes
  2006-04-15  0:54       ` Peter Williams
@ 2006-04-17 16:59         ` Siddha, Suresh B
  2006-04-17 23:21           ` Peter Williams
                             ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Siddha, Suresh B @ 2006-04-17 16:59 UTC (permalink / raw)
  To: Peter Williams
  Cc: Siddha, Suresh B, Andrew Morton, Con Kolivas, Chen, Kenneth W,
	Ingo Molnar, Mike Galbraith, Nick Piggin,
	Linux Kernel Mailing List

On Sat, Apr 15, 2006 at 10:54:45AM +1000, Peter Williams wrote:
> Yes, there are problems with the active/expired arrays but they're no 
> worse with smpnice than they are without it.

With smpnice, if we make any wrong decision while moving the high and
low priority tasks, we will enter an endless loop(as load balance
will keep on showing the imbalance and move_tasks will always move
the wrong tasks, instead of the correct ones pointed by imbalance..)

With the current code in the mainline, we don't have such a problem.

> Going to a single priority array would eliminate this problem not to 
> mention other problems to do with when to excuse tasks from being put on 
> the expired array at the end of their time slice (see Mike Galbraith's 
> patches).  But I haven't had much luck pushing that barrow. :-)

I will take a loot at those patches.

> If you have a better suggestion for how move_tasks() does its job, how 
> about providing a patch and with supporting arguments as to why its 
> better.  If it is better then we can use it.

I think we can have a second pass(if the first pass fails to move any),
in which we will not skip those tasks which will become highest priority
on the target runqueue...

> > The only special check in find_busiest_group() helping MT/MC balancing
> > is pwr_now and pwr_move calculations..
> 
> What about the very messy code I had to put in so that 
> find_busiest_group() would return a group even if there were no queues 
> in the group with more than one task.  Similar for find_busiest_queue().

Thats messy for sure and that was introduced by you to fix an imabalance
issue for a simple DP system(with out breaking HT systems). I will post a fix
for that.

thanks,
suresh

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] sched: modify move_tasks() to improve load balancing outcomes
  2006-04-17 16:59         ` Siddha, Suresh B
@ 2006-04-17 23:21           ` Peter Williams
  2006-04-17 23:56           ` Peter Williams
  2006-04-18  0:18           ` Peter Williams
  2 siblings, 0 replies; 9+ messages in thread
From: Peter Williams @ 2006-04-17 23:21 UTC (permalink / raw)
  To: Siddha, Suresh B
  Cc: Andrew Morton, Con Kolivas, Chen, Kenneth W, Ingo Molnar,
	Mike Galbraith, Nick Piggin, Linux Kernel Mailing List

Siddha, Suresh B wrote:
> On Sat, Apr 15, 2006 at 10:54:45AM +1000, Peter Williams wrote:
>> Yes, there are problems with the active/expired arrays but they're no 
>>> The only special check in find_busiest_group() helping MT/MC balancing
>>> is pwr_now and pwr_move calculations..
>> What about the very messy code I had to put in so that 
>> find_busiest_group() would return a group even if there were no queues 
>> in the group with more than one task.  Similar for find_busiest_queue().
> 
> Thats messy for sure and that was introduced by you to fix an imabalance
> issue for a simple DP system(with out breaking HT systems). I will post a fix
> for that.

By moving the active load balance trigger out of load_balance() and 
providing an independent trigger that does not require sub optimal 
normal load balancing, I hope?

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] sched: modify move_tasks() to improve load balancing outcomes
  2006-04-17 16:59         ` Siddha, Suresh B
  2006-04-17 23:21           ` Peter Williams
@ 2006-04-17 23:56           ` Peter Williams
  2006-04-18  0:18           ` Peter Williams
  2 siblings, 0 replies; 9+ messages in thread
From: Peter Williams @ 2006-04-17 23:56 UTC (permalink / raw)
  To: Siddha, Suresh B
  Cc: Andrew Morton, Con Kolivas, Chen, Kenneth W, Ingo Molnar,
	Mike Galbraith, Nick Piggin, Linux Kernel Mailing List

Siddha, Suresh B wrote:
> On Sat, Apr 15, 2006 at 10:54:45AM +1000, Peter Williams wrote:
>> If you have a better suggestion for how move_tasks() does its job, how 
>> about providing a patch and with supporting arguments as to why its 
>> better.  If it is better then we can use it.
> 
> I think we can have a second pass(if the first pass fails to move any),
> in which we will not skip those tasks which will become highest priority
> on the target runqueue...

That won't solve the problem that this patch is intended to address.  As 
a reminder the problem is that a low priority task is being moved when 
we would prefer a high priority task to be moved.  I.e. we want to move 
the high priority task because it's the best one to move not because we 
couldn't move any tasks.

NB (ignoring races and can_migrate_task() vetoing a task being moved) 
there should always be a task that can be moved as the minimum value of 
imbalance is the average load per task of the source queue and there 
must be tasks whose load weight is less than or equal to the average 
(it's axiomatic).

To put that another way, only a change in the state of the source queue 
since imbalance was determined (possible because that's all done without 
locks) or can_migrate_task() will stop at least one task being moved.

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] sched: modify move_tasks() to improve load balancing outcomes
  2006-04-17 16:59         ` Siddha, Suresh B
  2006-04-17 23:21           ` Peter Williams
  2006-04-17 23:56           ` Peter Williams
@ 2006-04-18  0:18           ` Peter Williams
  2 siblings, 0 replies; 9+ messages in thread
From: Peter Williams @ 2006-04-18  0:18 UTC (permalink / raw)
  To: Siddha, Suresh B
  Cc: Andrew Morton, Con Kolivas, Chen, Kenneth W, Ingo Molnar,
	Mike Galbraith, Nick Piggin, Linux Kernel Mailing List

Siddha, Suresh B wrote:
> On Sat, Apr 15, 2006 at 10:54:45AM +1000, Peter Williams wrote:
>> Yes, there are problems with the active/expired arrays but they're no 
>> worse with smpnice than they are without it.
> 
> With smpnice, if we make any wrong decision while moving the high and
> low priority tasks, we will enter an endless loop(as load balance
> will keep on showing the imbalance and move_tasks will always move
> the wrong tasks, instead of the correct ones pointed by imbalance..)

This will only happen if the load weight for the task moved is larger 
than the difference between the weighted loads for the two queues.  So a 
check based on this observation can be used to prevent the loop.

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2006-04-18  0:18 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-04-13  6:57 [PATCH] sched: modify move_tasks() to improve load balancing outcomes Peter Williams
2006-04-13 23:51 ` Siddha, Suresh B
2006-04-14  1:50   ` Peter Williams
2006-04-14 18:27     ` Siddha, Suresh B
2006-04-15  0:54       ` Peter Williams
2006-04-17 16:59         ` Siddha, Suresh B
2006-04-17 23:21           ` Peter Williams
2006-04-17 23:56           ` Peter Williams
2006-04-18  0:18           ` Peter Williams

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox