public inbox for linux-kernel@vger.kernel.org
* [PATCH] sched: Fix __load_balance_iterator() for cfq with only one task
@ 2008-09-05 12:30 Gautham R Shenoy
  2008-09-05 15:13 ` Peter Zijlstra
  2008-09-12 15:52 ` Chris Friesen
  0 siblings, 2 replies; 9+ messages in thread
From: Gautham R Shenoy @ 2008-09-05 12:30 UTC (permalink / raw)
  To: Peter Zijlstra, Vaidyanathan Srinivasan, Balbir Singh,
	Ingo Molnar
  Cc: linux-kernel, Dipankar Sarma

sched: Fix __load_balance_iterator() for cfq with only one task.

From: Gautham R Shenoy <ego@in.ibm.com>

__load_balance_iterator() returns NULL when the cfs_rq holds only one
sched_entity and that entity is a task. The following code path causes it:


	/* Skip over entities that are not tasks */
	do {
		se = list_entry(next, struct sched_entity, group_node);
		next = next->next;
	} while (next != &cfs_rq->tasks && !entity_is_task(se));

	if (next == &cfs_rq->tasks)
		return NULL;
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      This will return NULL even when se is a task.

As a side effect, sched_mc behavior has regressed since 2.6.25: when
iter_move_one_task() calls load_balance_start_fair(), it does not get
any tasks to move!

Fix this by checking if the last entity was a task or not.
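
A minimal user-space sketch of the failure described above, not the kernel code itself. The types struct entity and struct runqueue, the entity_of() macro, and the `fixed` flag are hypothetical stand-ins for struct sched_entity, struct cfs_rq, list_entry() and the patched condition. The early return for an empty list is an assumption, since it is not part of the quoted snippet.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel types; hypothetical, for illustration only. */
struct list_head { struct list_head *next, *prev; };

struct entity {
	int is_task;			/* plays the role of entity_is_task(se) */
	struct list_head group_node;
};

struct runqueue {
	struct list_head tasks;		/* circular list head, like cfs_rq->tasks */
	struct list_head *balance_iterator;
};

/* Hypothetical equivalent of list_entry(ptr, struct entity, group_node). */
#define entity_of(ptr) \
	((struct entity *)((char *)(ptr) - offsetof(struct entity, group_node)))

static void rq_init(struct runqueue *rq)
{
	rq->tasks.next = rq->tasks.prev = &rq->tasks;
	rq->balance_iterator = NULL;
}

/* Append an entity at the tail of the circular list. */
static void rq_add_tail(struct runqueue *rq, struct entity *e)
{
	struct list_head *head = &rq->tasks;

	assert(rq != NULL && e != NULL);
	e->group_node.prev = head->prev;
	e->group_node.next = head;
	head->prev->next = &e->group_node;
	head->prev = &e->group_node;
}

/* One iterator step; fixed != 0 selects the exit condition from the patch. */
static struct entity *iter_step(struct runqueue *rq, struct list_head *next,
				int fixed)
{
	struct entity *se;

	/* Assumed guard: nothing left to visit. */
	if (next == &rq->tasks)
		return NULL;

	/* Skip over entities that are not tasks. */
	do {
		se = entity_of(next);
		next = next->next;
	} while (next != &rq->tasks && !se->is_task);

	/* Old check ignores se; patched check also requires se not to be a task. */
	if (next == &rq->tasks && (!fixed || !se->is_task))
		return NULL;

	rq->balance_iterator = next;
	return se;
}
```

With a single task on the list, iter_step(&rq, rq.tasks.next, 0) gives NULL, while the same call with fixed set to 1 returns that task and leaves balance_iterator at the list head, so the following step ends the walk.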

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
---

 kernel/sched_fair.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)


diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index fb8994c..f1c96e3 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -1451,7 +1451,7 @@ __load_balance_iterator(struct cfs_rq *cfs_rq, struct list_head *next)
 		next = next->next;
 	} while (next != &cfs_rq->tasks && !entity_is_task(se));
 
-	if (next == &cfs_rq->tasks)
+	if (next == &cfs_rq->tasks && !entity_is_task(se))
 		return NULL;
 
 	cfs_rq->balance_iterator = next;
-- 
Thanks and Regards
gautham


* Re: [PATCH] sched: Fix __load_balance_iterator() for cfq with only one task
  2008-09-05 12:30 [PATCH] sched: Fix __load_balance_iterator() for cfq with only one task Gautham R Shenoy
@ 2008-09-05 15:13 ` Peter Zijlstra
  2008-09-05 17:23   ` Peter Zijlstra
  2008-09-12 15:52 ` Chris Friesen
  1 sibling, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2008-09-05 15:13 UTC (permalink / raw)
  To: ego
  Cc: Gregory Haskins, Vaidyanathan Srinivasan, Balbir Singh,
	Ingo Molnar, linux-kernel, Dipankar Sarma

On Fri, 2008-09-05 at 18:00 +0530, Gautham R Shenoy wrote:
> sched: Fix __load_balance_iterator() for cfq with only one task.
> 
> From: Gautham R Shenoy <ego@in.ibm.com>
> 
> The __load_balance_iterator() returns a NULL when there's only one
> sched_entity which is a task. It is caused by the following code-path.
> 
> 
> 	/* Skip over entities that are not tasks */
> 	do {
> 		se = list_entry(next, struct sched_entity, group_node);
> 		next = next->next;
> 	} while (next != &cfs_rq->tasks && !entity_is_task(se));
> 
> 	if (next == &cfs_rq->tasks)
> 		return NULL;
> 	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>       This will return NULL even when se is a task.
> 
> As a side-effect, there was a regression in sched_mc behavior since 2.6.25,
> since iter_move_one_task() when it calls load_balance_start_fair(),
> would not get any tasks to move!
> 
> Fix this by checking if the last entity was a task or not.

Gregory did a similar fix a while ago, but that caused grief of some
kind..

Greg, can you recollect why we pulled it? I can't seem to find it.

Aside from that this patch looks fine..

> Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> ---
> 
>  kernel/sched_fair.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> 
> diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
> index fb8994c..f1c96e3 100644
> --- a/kernel/sched_fair.c
> +++ b/kernel/sched_fair.c
> @@ -1451,7 +1451,7 @@ __load_balance_iterator(struct cfs_rq *cfs_rq, struct list_head *next)
>  		next = next->next;
>  	} while (next != &cfs_rq->tasks && !entity_is_task(se));
>  
> -	if (next == &cfs_rq->tasks)
> +	if (next == &cfs_rq->tasks && !entity_is_task(se))
>  		return NULL;
>  
>  	cfs_rq->balance_iterator = next;



* Re: [PATCH] sched: Fix __load_balance_iterator() for cfq with only one task
  2008-09-05 15:13 ` Peter Zijlstra
@ 2008-09-05 17:23   ` Peter Zijlstra
  2008-09-12  6:35     ` Gautham R Shenoy
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2008-09-05 17:23 UTC (permalink / raw)
  To: ego
  Cc: Mike Galbraith, Gregory Haskins, Vaidyanathan Srinivasan,
	Balbir Singh, Ingo Molnar, linux-kernel, Dipankar Sarma

On Fri, 2008-09-05 at 17:13 +0200, Peter Zijlstra wrote:
> On Fri, 2008-09-05 at 18:00 +0530, Gautham R Shenoy wrote:
> > sched: Fix __load_balance_iterator() for cfq with only one task.
> > 
> > From: Gautham R Shenoy <ego@in.ibm.com>
> > 
> > The __load_balance_iterator() returns a NULL when there's only one
> > sched_entity which is a task. It is caused by the following code-path.
> > 
> > 
> > 	/* Skip over entities that are not tasks */
> > 	do {
> > 		se = list_entry(next, struct sched_entity, group_node);
> > 		next = next->next;
> > 	} while (next != &cfs_rq->tasks && !entity_is_task(se));
> > 
> > 	if (next == &cfs_rq->tasks)
> > 		return NULL;
> > 	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >       This will return NULL even when se is a task.
> > 
> > As a side-effect, there was a regression in sched_mc behavior since 2.6.25,
> > since iter_move_one_task() when it calls load_balance_start_fair(),
> > would not get any tasks to move!
> > 
> > Fix this by checking if the last entity was a task or not.
> 
> Gregory did a similar fix a while ago, but that caused grief of some
> kind..
> 
> Greg, can you recollect why we pulled it? I can't seem to find it.

Gregory pointed me to this thread:

  http://lkml.org/lkml/2008/8/11/81

ego, can you run sysbench to confirm?

> Aside from that this patch looks fine..
> 
> > Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
> > Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> > Cc: Ingo Molnar <mingo@elte.hu>
> > ---
> > 
> >  kernel/sched_fair.c |    2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> > 
> > 
> > diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
> > index fb8994c..f1c96e3 100644
> > --- a/kernel/sched_fair.c
> > +++ b/kernel/sched_fair.c
> > @@ -1451,7 +1451,7 @@ __load_balance_iterator(struct cfs_rq *cfs_rq, struct list_head *next)
> >  		next = next->next;
> >  	} while (next != &cfs_rq->tasks && !entity_is_task(se));
> >  
> > -	if (next == &cfs_rq->tasks)
> > +	if (next == &cfs_rq->tasks && !entity_is_task(se))
> >  		return NULL;
> >  
> >  	cfs_rq->balance_iterator = next;
> 



* Re: [PATCH] sched: Fix __load_balance_iterator() for cfq with only one task
  2008-09-05 17:23   ` Peter Zijlstra
@ 2008-09-12  6:35     ` Gautham R Shenoy
  2008-09-12  6:56       ` Peter Zijlstra
                         ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Gautham R Shenoy @ 2008-09-12  6:35 UTC (permalink / raw)
  To: Peter Zijlstra, Mike Galbraith
  Cc: Gregory Haskins, Vaidyanathan Srinivasan, Balbir Singh,
	Ingo Molnar, linux-kernel, Dipankar Sarma, Srivatsa Vaddagiri

On Fri, Sep 05, 2008 at 07:23:44PM +0200, Peter Zijlstra wrote:
> On Fri, 2008-09-05 at 17:13 +0200, Peter Zijlstra wrote:
> > On Fri, 2008-09-05 at 18:00 +0530, Gautham R Shenoy wrote:
> > > sched: Fix __load_balance_iterator() for cfq with only one task.
> > > 
> > > From: Gautham R Shenoy <ego@in.ibm.com>
> > > 
> > > The __load_balance_iterator() returns a NULL when there's only one
> > > sched_entity which is a task. It is caused by the following code-path.
> > > 
> > > 
> > > 	/* Skip over entities that are not tasks */
> > > 	do {
> > > 		se = list_entry(next, struct sched_entity, group_node);
> > > 		next = next->next;
> > > 	} while (next != &cfs_rq->tasks && !entity_is_task(se));
> > > 
> > > 	if (next == &cfs_rq->tasks)
> > > 		return NULL;
> > > 	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > >       This will return NULL even when se is a task.
> > > 
> > > As a side-effect, there was a regression in sched_mc behavior since 2.6.25,
> > > since iter_move_one_task() when it calls load_balance_start_fair(),
> > > would not get any tasks to move!
> > > 
> > > Fix this by checking if the last entity was a task or not.
> > 
> > Gregory did a similar fix a while ago, but that caused grief of some
> > kind..
> > 
> > Greg, can you recollect why we pulled it? I can't seem to find it.
> 
> Gregory pointed me to this thread:
> 
>   http://lkml.org/lkml/2008/8/11/81
> 
> ego, can you run sysbench to confirm?

Am planning to run it today.

Mike, with what --oltp-* mode did you run the sysbench test?

That aside, if Mike's analysis is correct regarding the client/server
pairs not running on the same CPU as buddies, shouldn't this be fixed in a
higher-level routine rather than by keeping this anomaly in
__load_balance_iterator(), which is supposed to return the runnable
tasks in the cfs_rq?

Its current behavior is that __load_balance_iterator() will
return NULL even if the last entity in the list is a runnable task.

This behavior clearly hinders sched_mc powersavings from migrating
a sole remaining task out of a powersavings sched_domain in order
to evacuate that domain and put all the CPUs of the domain into a
low-power state.
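
To make the "last entity" case concrete, here is a small user-space model, offered as a sketch under stated assumptions rather than kernel code. It replaces the linked list with an array in which a nonzero is_task[i] marks a task, 0 marks a group entity, and index n stands in for the list head. The names iter_step(), count_visible_tasks() and the `fixed` flag are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Array model of cfs_rq->tasks (an assumption for illustration):
 * is_task[i] != 0 marks entry i as a task, 0 marks a group entity,
 * and index n plays the role of the list head.
 * Returns the index of the task found, or -1 when the walk ends.
 */
static long iter_step(const int *is_task, size_t n, size_t *pos, int fixed)
{
	size_t cur, next = *pos;

	if (next == n)
		return -1;

	/* Skip over entries that are not tasks. */
	do {
		cur = next;
		next = next + 1;
	} while (next != n && !is_task[cur]);

	/* fixed == 0: original check; fixed != 0: check from the patch. */
	if (next == n && (!fixed || !is_task[cur]))
		return -1;

	*pos = next;
	return (long)cur;
}

/* Count how many tasks a start/next style walk hands to the balancer. */
static size_t count_visible_tasks(const int *is_task, size_t n, int fixed)
{
	size_t pos = 0, seen = 0;

	assert(is_task != NULL || n == 0);
	while (iter_step(is_task, n, &pos, fixed) >= 0)
		seen++;
	return seen;
}
```

For the layout { group, task, task }, the original check reports one task and the patched check reports two. For a single task, the original check reports none, which is the case that matters when a domain has one task left to evacuate.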

> 
> > Aside from that this patch looks fine..
> > 
> > > Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
> > > Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > > Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> > > Cc: Ingo Molnar <mingo@elte.hu>
> > > ---
> > > 
> > >  kernel/sched_fair.c |    2 +-
> > >  1 files changed, 1 insertions(+), 1 deletions(-)
> > > 
> > > 
> > > diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
> > > index fb8994c..f1c96e3 100644
> > > --- a/kernel/sched_fair.c
> > > +++ b/kernel/sched_fair.c
> > > @@ -1451,7 +1451,7 @@ __load_balance_iterator(struct cfs_rq *cfs_rq, struct list_head *next)
> > >  		next = next->next;
> > >  	} while (next != &cfs_rq->tasks && !entity_is_task(se));
> > >  
> > > -	if (next == &cfs_rq->tasks)
> > > +	if (next == &cfs_rq->tasks && !entity_is_task(se))
> > >  		return NULL;
> > >  
> > >  	cfs_rq->balance_iterator = next;
> > 
> 

-- 
Thanks and Regards
gautham


* Re: [PATCH] sched: Fix __load_balance_iterator() for cfq with only one task
  2008-09-12  6:35     ` Gautham R Shenoy
@ 2008-09-12  6:56       ` Peter Zijlstra
  2008-09-12  7:05       ` Mike Galbraith
  2008-09-12 10:57       ` Mike Galbraith
  2 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2008-09-12  6:56 UTC (permalink / raw)
  To: ego
  Cc: Mike Galbraith, Gregory Haskins, Vaidyanathan Srinivasan,
	Balbir Singh, Ingo Molnar, linux-kernel, Dipankar Sarma,
	Srivatsa Vaddagiri

On Fri, 2008-09-12 at 12:05 +0530, Gautham R Shenoy wrote:
> On Fri, Sep 05, 2008 at 07:23:44PM +0200, Peter Zijlstra wrote:
> > On Fri, 2008-09-05 at 17:13 +0200, Peter Zijlstra wrote:
> > > On Fri, 2008-09-05 at 18:00 +0530, Gautham R Shenoy wrote:
> > > > sched: Fix __load_balance_iterator() for cfq with only one task.
> > > > 
> > > > From: Gautham R Shenoy <ego@in.ibm.com>
> > > > 
> > > > The __load_balance_iterator() returns a NULL when there's only one
> > > > sched_entity which is a task. It is caused by the following code-path.
> > > > 
> > > > 
> > > > 	/* Skip over entities that are not tasks */
> > > > 	do {
> > > > 		se = list_entry(next, struct sched_entity, group_node);
> > > > 		next = next->next;
> > > > 	} while (next != &cfs_rq->tasks && !entity_is_task(se));
> > > > 
> > > > 	if (next == &cfs_rq->tasks)
> > > > 		return NULL;
> > > > 	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > >       This will return NULL even when se is a task.
> > > > 
> > > > As a side-effect, there was a regression in sched_mc behavior since 2.6.25,
> > > > since iter_move_one_task() when it calls load_balance_start_fair(),
> > > > would not get any tasks to move!
> > > > 
> > > > Fix this by checking if the last entity was a task or not.
> > > 
> > > Gregory did a similar fix a while ago, but that caused grief of some
> > > kind..
> > > 
> > > Greg, can you recollect why we pulled it? I can't seem to find it.
> > 
> > Gregory pointed me to this thread:
> > 
> >   http://lkml.org/lkml/2008/8/11/81
> > 
> > ego, can you run sysbench to confirm?
> 
> Am planning to run it today.
> 
> Mike, with what --oltp-* mode did you run the sysbench test?
> 
> That aside, if Mike's analysis is correct regarding the client/server
> pairs not running on the same CPU as buddies, shouldn't this be fixed in a
> higher-level routine rather than by keeping this anomaly in
> __load_balance_iterator(), which is supposed to return the runnable
> tasks in the cfs_rq?
> 
> Its current behavior is that __load_balance_iterator() will
> return NULL even if the last entity in the list is a runnable task.
> 
> This behavior clearly hinders sched_mc powersavings from migrating
> a sole remaining task out of a powersavings sched_domain in order
> to evacuate that domain and put all the CPUs of the domain into a
> low-power state.

Sure - there is buddy_hot in task_hot() to avoid moving buddies, and I
think we should do something like this:

@@ -590,7 +602,7 @@ account_entity_enqueue(struct cfs_rq *cfs_rq, struct sched_entity *se)
                add_cfs_task_weight(cfs_rq, se->load.weight);
        cfs_rq->nr_running++;
        se->on_rq = 1;
-       list_add(&se->group_node, &cfs_rq->tasks);
+       list_add_tail(&se->group_node, &cfs_rq->tasks);
 }

 static void

(most likely whitespace damaged)
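
The effect of that one-line change on iteration order can be sketched in user space. This is an illustration only: add_head() and add_tail() are hypothetical re-implementations that mirror the insertion positions of the kernel's list_add() and list_add_tail(), and struct item with its id field is invented for the example. The mail does not spell out the reasoning behind the change, so the sketch shows only the ordering difference.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Hypothetical payload; id records the order in which items were enqueued. */
struct item {
	int id;
	struct list_head node;
};

static void list_init(struct list_head *head)
{
	head->next = head->prev = head;
}

static void insert_between(struct list_head *entry, struct list_head *prev,
			   struct list_head *next)
{
	next->prev = entry;
	entry->next = next;
	entry->prev = prev;
	prev->next = entry;
}

/* Same position as list_add(): right after the head, newest entry first. */
static void add_head(struct list_head *entry, struct list_head *head)
{
	insert_between(entry, head, head->next);
}

/* Same position as list_add_tail(): right before the head, newest entry last. */
static void add_tail(struct list_head *entry, struct list_head *head)
{
	insert_between(entry, head->prev, head);
}

/* Walk from head->next and record the ids in visiting order. */
static size_t collect_ids(const struct list_head *head, int *out, size_t cap)
{
	const struct list_head *pos;
	size_t n = 0;

	assert(head != NULL && out != NULL);
	for (pos = head->next; pos != head && n < cap; pos = pos->next) {
		const struct item *it = (const struct item *)
			((const char *)pos - offsetof(struct item, node));
		out[n++] = it->id;
	}
	return n;
}
```

Enqueueing ids 1, 2, 3 with add_head() yields the walk order 3, 2, 1, while add_tail() yields 1, 2, 3. A walk that starts at the head of the list therefore reaches the most recently enqueued entity first in the first case and last in the second.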





* Re: [PATCH] sched: Fix __load_balance_iterator() for cfq with only one task
  2008-09-12  6:35     ` Gautham R Shenoy
  2008-09-12  6:56       ` Peter Zijlstra
@ 2008-09-12  7:05       ` Mike Galbraith
  2008-09-12 10:57       ` Mike Galbraith
  2 siblings, 0 replies; 9+ messages in thread
From: Mike Galbraith @ 2008-09-12  7:05 UTC (permalink / raw)
  To: ego
  Cc: Peter Zijlstra, Gregory Haskins, Vaidyanathan Srinivasan,
	Balbir Singh, Ingo Molnar, linux-kernel, Dipankar Sarma,
	Srivatsa Vaddagiri

[-- Attachment #1: Type: text/plain, Size: 188 bytes --]

On Fri, 2008-09-12 at 12:05 +0530, Gautham R Shenoy wrote:

> Mike, with what --oltp-* mode did you run the sysbench test?

My test script, cobbled together via Google, is attached.

	-Mike

[-- Attachment #2: sysbench.test --]
[-- Type: application/x-shellscript, Size: 2351 bytes --]


* Re: [PATCH] sched: Fix __load_balance_iterator() for cfq with only one task
  2008-09-12  6:35     ` Gautham R Shenoy
  2008-09-12  6:56       ` Peter Zijlstra
  2008-09-12  7:05       ` Mike Galbraith
@ 2008-09-12 10:57       ` Mike Galbraith
  2008-09-12 11:07         ` Gautham R Shenoy
  2 siblings, 1 reply; 9+ messages in thread
From: Mike Galbraith @ 2008-09-12 10:57 UTC (permalink / raw)
  To: ego
  Cc: Peter Zijlstra, Gregory Haskins, Vaidyanathan Srinivasan,
	Balbir Singh, Ingo Molnar, linux-kernel, Dipankar Sarma,
	Srivatsa Vaddagiri

On Fri, 2008-09-12 at 12:05 +0530, Gautham R Shenoy wrote:
> On Fri, Sep 05, 2008 at 07:23:44PM +0200, Peter Zijlstra wrote:

> > ego, can you run sysbench to confirm?
> 
> Am planning to run it today.

I just tested rc6 with 6d299f1b53b84e2665f402d9bcc494800aba6386 applied,
and it does not exhibit the problem which triggered the revert, so I'd
suggest reverting the revert.  (Perhaps more clock manifestations)

	-Mike



* Re: [PATCH] sched: Fix __load_balance_iterator() for cfq with only one task
  2008-09-12 10:57       ` Mike Galbraith
@ 2008-09-12 11:07         ` Gautham R Shenoy
  0 siblings, 0 replies; 9+ messages in thread
From: Gautham R Shenoy @ 2008-09-12 11:07 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Peter Zijlstra, Gregory Haskins, Vaidyanathan Srinivasan,
	Balbir Singh, Ingo Molnar, linux-kernel, Dipankar Sarma,
	Srivatsa Vaddagiri

On Fri, Sep 12, 2008 at 12:57:08PM +0200, Mike Galbraith wrote:
> On Fri, 2008-09-12 at 12:05 +0530, Gautham R Shenoy wrote:
> > On Fri, Sep 05, 2008 at 07:23:44PM +0200, Peter Zijlstra wrote:
> 
> > > ego, can you run sysbench to confirm?
> > 
> > Am planning to run it today.
> 
> I just tested rc6 with 6d299f1b53b84e2665f402d9bcc494800aba6386 applied,
> and it does not exhibit the problem which triggered the revert, so I'd
> suggest reverting it.  (Perhaps more clock manifestations)

That's good news!

Thanks for doing this
> 
> 	-Mike
> 

-- 
Thanks and Regards
gautham


* Re: [PATCH] sched: Fix __load_balance_iterator() for cfq with only one task
  2008-09-05 12:30 [PATCH] sched: Fix __load_balance_iterator() for cfq with only one task Gautham R Shenoy
  2008-09-05 15:13 ` Peter Zijlstra
@ 2008-09-12 15:52 ` Chris Friesen
  1 sibling, 0 replies; 9+ messages in thread
From: Chris Friesen @ 2008-09-12 15:52 UTC (permalink / raw)
  To: ego
  Cc: Peter Zijlstra, Vaidyanathan Srinivasan, Balbir Singh,
	Ingo Molnar, linux-kernel, Dipankar Sarma

Gautham R Shenoy wrote:
> sched: Fix __load_balance_iterator() for cfq with only one task.
> 
> From: Gautham R Shenoy <ego@in.ibm.com>
> 
> The __load_balance_iterator() returns a NULL when there's only one
> sched_entity which is a task. It is caused by the following code-path.
> 
> 
> 	/* Skip over entities that are not tasks */
> 	do {
> 		se = list_entry(next, struct sched_entity, group_node);
> 		next = next->next;
> 	} while (next != &cfs_rq->tasks && !entity_is_task(se));
> 
> 	if (next == &cfs_rq->tasks)
> 		return NULL;
> 	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>       This will return NULL even when se is a task.

Thank you!  I'd been looking suspiciously at this routine as well due to 
strange load-balancing behaviour that I saw while testing the fair group 
code, but I hadn't yet tracked down the exact problem.

Peter/Ingo, this appears to explain the issues described in the mail I 
sent on the 4th.  After applying this change the imbalance between tasks 
in the same group is substantially reduced.

Chris


end of thread, other threads:[~2008-09-12 15:53 UTC | newest]

Thread overview: 9+ messages
2008-09-05 12:30 [PATCH] sched: Fix __load_balance_iterator() for cfq with only one task Gautham R Shenoy
2008-09-05 15:13 ` Peter Zijlstra
2008-09-05 17:23   ` Peter Zijlstra
2008-09-12  6:35     ` Gautham R Shenoy
2008-09-12  6:56       ` Peter Zijlstra
2008-09-12  7:05       ` Mike Galbraith
2008-09-12 10:57       ` Mike Galbraith
2008-09-12 11:07         ` Gautham R Shenoy
2008-09-12 15:52 ` Chris Friesen
