* [PATCH] sched: Fix __load_balance_iterator() for cfq with only one task
@ 2008-09-05 12:30 Gautham R Shenoy
2008-09-05 15:13 ` Peter Zijlstra
2008-09-12 15:52 ` Chris Friesen
0 siblings, 2 replies; 9+ messages in thread
From: Gautham R Shenoy @ 2008-09-05 12:30 UTC (permalink / raw)
To: Peter Zijlstra, Vaidyanathan Srinivasan, Balbir Singh,
Ingo Molnar
Cc: linux-kernel, Dipankar Sarma
sched: Fix __load_balance_iterator() for cfq with only one task.
From: Gautham R Shenoy <ego@in.ibm.com>
__load_balance_iterator() returns NULL when the cfs_rq holds only one
sched_entity and that entity is a task. The following code path causes it.
	/* Skip over entities that are not tasks */
	do {
		se = list_entry(next, struct sched_entity, group_node);
		next = next->next;
	} while (next != &cfs_rq->tasks && !entity_is_task(se));

	if (next == &cfs_rq->tasks)
		return NULL;
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	This will return NULL even when se is a task.
As a side effect, sched_mc behavior has regressed since 2.6.25:
when iter_move_one_task() calls load_balance_start_fair(), it does
not get any tasks to move!
Fix this by checking if the last entity was a task or not.
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
---
kernel/sched_fair.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index fb8994c..f1c96e3 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -1451,7 +1451,7 @@ __load_balance_iterator(struct cfs_rq *cfs_rq, struct list_head *next)
 		next = next->next;
 	} while (next != &cfs_rq->tasks && !entity_is_task(se));
 
-	if (next == &cfs_rq->tasks)
+	if (next == &cfs_rq->tasks && !entity_is_task(se))
 		return NULL;
 
 	cfs_rq->balance_iterator = next;
--
Thanks and Regards
gautham
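The one-task case described in the patch above can be reproduced in user space. What follows is only a minimal sketch, not the kernel code: struct entity, its is_task flag, entity_of(), next_task() and the "fixed" switch are hypothetical stand-ins for struct sched_entity, entity_is_task(), list_entry() and the two versions of the exit test. The early return when the walk starts at the list head is an assumption added so that the sketch never dereferences the head as an entity.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical, simplified entity; is_task replaces entity_is_task(). */
struct entity {
	bool is_task;
	struct list_head group_node;
};

/* Recover the entity from its embedded list node (like list_entry()). */
#define entity_of(ptr) \
	((struct entity *)((char *)(ptr) - offsetof(struct entity, group_node)))

static void list_init(struct list_head *head)
{
	head->next = head->prev = head;
}

static void list_add_tail(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/*
 * Walk the list from 'next' and return the first task entity, or NULL.
 * fixed == false uses the pre-patch exit test, fixed == true the
 * post-patch one.
 */
static struct entity *next_task(struct list_head *head,
				struct list_head *next, bool fixed)
{
	struct entity *se;

	/* Assumed guard: nothing to walk if we start at the head. */
	if (next == head)
		return NULL;

	/* Skip over entities that are not tasks */
	do {
		se = entity_of(next);
		next = next->next;
	} while (next != head && !se->is_task);

	/* Pre-patch: reaching the head always means "no task found". */
	if (next == head && (!fixed || !se->is_task))
		return NULL;

	return se;
}
```

With a list that holds a single task, next_task(&head, head.next, false) returns NULL because the loop exits with next already back at the head, while next_task(&head, head.next, true) returns the task. A list that holds only a non-task entity yields NULL in both modes, which is the behaviour the extra !entity_is_task(se) test is meant to preserve.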
* Re: [PATCH] sched: Fix __load_balance_iterator() for cfq with only one task
2008-09-05 12:30 [PATCH] sched: Fix __load_balance_iterator() for cfq with only one task Gautham R Shenoy
@ 2008-09-05 15:13 ` Peter Zijlstra
2008-09-05 17:23 ` Peter Zijlstra
2008-09-12 15:52 ` Chris Friesen
1 sibling, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2008-09-05 15:13 UTC (permalink / raw)
To: ego
Cc: Gregory Haskins, Vaidyanathan Srinivasan, Balbir Singh,
Ingo Molnar, linux-kernel, Dipankar Sarma

On Fri, 2008-09-05 at 18:00 +0530, Gautham R Shenoy wrote:
> sched: Fix __load_balance_iterator() for cfq with only one task.
>
> From: Gautham R Shenoy <ego@in.ibm.com>
>
> The __load_balance_iterator() returns a NULL when there's only one
> sched_entity which is a task. It is caused by the following code-path.
>
>
> 	/* Skip over entities that are not tasks */
> 	do {
> 		se = list_entry(next, struct sched_entity, group_node);
> 		next = next->next;
> 	} while (next != &cfs_rq->tasks && !entity_is_task(se));
>
> 	if (next == &cfs_rq->tasks)
> 		return NULL;
> 	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 	This will return NULL even when se is a task.
>
> As a side-effect, there was a regression in sched_mc behavior since 2.6.25,
> since iter_move_one_task() when it calls load_balance_start_fair(),
> would not get any tasks to move!
>
> Fix this by checking if the last entity was a task or not.

Gregory did a similar fix a while ago, but that caused grief of some
kind..

Greg, can you recollect why we pulled it? I can't seem to find it.

Aside from that this patch looks fine..

> Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> ---
>
> kernel/sched_fair.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
>
> diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
> index fb8994c..f1c96e3 100644
> --- a/kernel/sched_fair.c
> +++ b/kernel/sched_fair.c
> @@ -1451,7 +1451,7 @@ __load_balance_iterator(struct cfs_rq *cfs_rq, struct list_head *next)
>  		next = next->next;
>  	} while (next != &cfs_rq->tasks && !entity_is_task(se));
>
> -	if (next == &cfs_rq->tasks)
> +	if (next == &cfs_rq->tasks && !entity_is_task(se))
>  		return NULL;
>
>  	cfs_rq->balance_iterator = next;
* Re: [PATCH] sched: Fix __load_balance_iterator() for cfq with only one task
2008-09-05 15:13 ` Peter Zijlstra
@ 2008-09-05 17:23 ` Peter Zijlstra
2008-09-12 6:35 ` Gautham R Shenoy
0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2008-09-05 17:23 UTC (permalink / raw)
To: ego
Cc: Mike Galbraith, Gregory Haskins, Vaidyanathan Srinivasan,
Balbir Singh, Ingo Molnar, linux-kernel, Dipankar Sarma

On Fri, 2008-09-05 at 17:13 +0200, Peter Zijlstra wrote:
> On Fri, 2008-09-05 at 18:00 +0530, Gautham R Shenoy wrote:
> > sched: Fix __load_balance_iterator() for cfq with only one task.
> >
> > From: Gautham R Shenoy <ego@in.ibm.com>
> >
> > The __load_balance_iterator() returns a NULL when there's only one
> > sched_entity which is a task. It is caused by the following code-path.
> >
> >
> > 	/* Skip over entities that are not tasks */
> > 	do {
> > 		se = list_entry(next, struct sched_entity, group_node);
> > 		next = next->next;
> > 	} while (next != &cfs_rq->tasks && !entity_is_task(se));
> >
> > 	if (next == &cfs_rq->tasks)
> > 		return NULL;
> > 	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > 	This will return NULL even when se is a task.
> >
> > As a side-effect, there was a regression in sched_mc behavior since 2.6.25,
> > since iter_move_one_task() when it calls load_balance_start_fair(),
> > would not get any tasks to move!
> >
> > Fix this by checking if the last entity was a task or not.
>
> Gregory did a similar fix a while ago, but that caused grief of some
> kind..
>
> Greg, can you recollect why we pulled it? I can't seem to find it.

Gregory pointed me to this thread:

http://lkml.org/lkml/2008/8/11/81

ego, can you run sysbench to confirm?

> Aside from that this patch looks fine..
>
> > Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
> > Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> > Cc: Ingo Molnar <mingo@elte.hu>
> > ---
> >
> > kernel/sched_fair.c | 2 +-
> > 1 files changed, 1 insertions(+), 1 deletions(-)
> >
> >
> > diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
> > index fb8994c..f1c96e3 100644
> > --- a/kernel/sched_fair.c
> > +++ b/kernel/sched_fair.c
> > @@ -1451,7 +1451,7 @@ __load_balance_iterator(struct cfs_rq *cfs_rq, struct list_head *next)
> >  		next = next->next;
> >  	} while (next != &cfs_rq->tasks && !entity_is_task(se));
> >
> > -	if (next == &cfs_rq->tasks)
> > +	if (next == &cfs_rq->tasks && !entity_is_task(se))
> >  		return NULL;
> >
> >  	cfs_rq->balance_iterator = next;
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
* Re: [PATCH] sched: Fix __load_balance_iterator() for cfq with only one task
2008-09-05 17:23 ` Peter Zijlstra
@ 2008-09-12 6:35 ` Gautham R Shenoy
2008-09-12 6:56 ` Peter Zijlstra
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Gautham R Shenoy @ 2008-09-12 6:35 UTC (permalink / raw)
To: Peter Zijlstra, Mike Galbraith
Cc: Gregory Haskins, Vaidyanathan Srinivasan, Balbir Singh,
Ingo Molnar, linux-kernel, Dipankar Sarma, Srivatsa Vaddagiri

On Fri, Sep 05, 2008 at 07:23:44PM +0200, Peter Zijlstra wrote:
> On Fri, 2008-09-05 at 17:13 +0200, Peter Zijlstra wrote:
> > On Fri, 2008-09-05 at 18:00 +0530, Gautham R Shenoy wrote:
> > > sched: Fix __load_balance_iterator() for cfq with only one task.
> > >
> > > From: Gautham R Shenoy <ego@in.ibm.com>
> > >
> > > The __load_balance_iterator() returns a NULL when there's only one
> > > sched_entity which is a task. It is caused by the following code-path.
> > >
> > >
> > > 	/* Skip over entities that are not tasks */
> > > 	do {
> > > 		se = list_entry(next, struct sched_entity, group_node);
> > > 		next = next->next;
> > > 	} while (next != &cfs_rq->tasks && !entity_is_task(se));
> > >
> > > 	if (next == &cfs_rq->tasks)
> > > 		return NULL;
> > > 	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > 	This will return NULL even when se is a task.
> > >
> > > As a side-effect, there was a regression in sched_mc behavior since 2.6.25,
> > > since iter_move_one_task() when it calls load_balance_start_fair(),
> > > would not get any tasks to move!
> > >
> > > Fix this by checking if the last entity was a task or not.
> >
> > Gregory did a similar fix a while ago, but that caused grief of some
> > kind..
> >
> > Greg, can you recollect why we pulled it? I can't seem to find it.
>
> Gregory pointed me to this thread:
>
> http://lkml.org/lkml/2008/8/11/81
>
> ego, can you run sysbench to confirm?

Am planning to run it today.

Mike, with what --oltp-* mode did you run the sysbench test?

That aside, if Mike's analysis is correct regarding the client/server
pairs not running on the same CPU as buddies, shouldn't this be fixed in a
higher level routine rather than have this anomaly in
__load_balance_iterator(), which is supposed to return the runnable
tasks in the cfs_rq?

Its current behavior is that __load_balance_iterator() will
return NULL even if the last entity in the list is a runnable task.

This behavior clearly hinders sched_mc powersavings from migrating
a sole remaining task from a powersavings-sched_domain in order
to evacuate that domain and put all the CPUs of the domain into a
low-power state.

>
> > Aside from that this patch looks fine..
> >
> > > Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
> > > Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > > Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> > > Cc: Ingo Molnar <mingo@elte.hu>
> > > ---
> > >
> > > kernel/sched_fair.c | 2 +-
> > > 1 files changed, 1 insertions(+), 1 deletions(-)
> > >
> > >
> > > diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
> > > index fb8994c..f1c96e3 100644
> > > --- a/kernel/sched_fair.c
> > > +++ b/kernel/sched_fair.c
> > > @@ -1451,7 +1451,7 @@ __load_balance_iterator(struct cfs_rq *cfs_rq, struct list_head *next)
> > >  		next = next->next;
> > >  	} while (next != &cfs_rq->tasks && !entity_is_task(se));
> > >
> > > -	if (next == &cfs_rq->tasks)
> > > +	if (next == &cfs_rq->tasks && !entity_is_task(se))
> > >  		return NULL;
> > >
> > >  	cfs_rq->balance_iterator = next;
>

--
Thanks and Regards
gautham
* Re: [PATCH] sched: Fix __load_balance_iterator() for cfq with only one task
2008-09-12 6:35 ` Gautham R Shenoy
@ 2008-09-12 6:56 ` Peter Zijlstra
2008-09-12 7:05 ` Mike Galbraith
2008-09-12 10:57 ` Mike Galbraith
2 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2008-09-12 6:56 UTC (permalink / raw)
To: ego
Cc: Mike Galbraith, Gregory Haskins, Vaidyanathan Srinivasan,
Balbir Singh, Ingo Molnar, linux-kernel, Dipankar Sarma,
Srivatsa Vaddagiri

On Fri, 2008-09-12 at 12:05 +0530, Gautham R Shenoy wrote:
> On Fri, Sep 05, 2008 at 07:23:44PM +0200, Peter Zijlstra wrote:
> > On Fri, 2008-09-05 at 17:13 +0200, Peter Zijlstra wrote:
> > > On Fri, 2008-09-05 at 18:00 +0530, Gautham R Shenoy wrote:
> > > > sched: Fix __load_balance_iterator() for cfq with only one task.
> > > >
> > > > From: Gautham R Shenoy <ego@in.ibm.com>
> > > >
> > > > The __load_balance_iterator() returns a NULL when there's only one
> > > > sched_entity which is a task. It is caused by the following code-path.
> > > >
> > > >
> > > > 	/* Skip over entities that are not tasks */
> > > > 	do {
> > > > 		se = list_entry(next, struct sched_entity, group_node);
> > > > 		next = next->next;
> > > > 	} while (next != &cfs_rq->tasks && !entity_is_task(se));
> > > >
> > > > 	if (next == &cfs_rq->tasks)
> > > > 		return NULL;
> > > > 	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > 	This will return NULL even when se is a task.
> > > >
> > > > As a side-effect, there was a regression in sched_mc behavior since 2.6.25,
> > > > since iter_move_one_task() when it calls load_balance_start_fair(),
> > > > would not get any tasks to move!
> > > >
> > > > Fix this by checking if the last entity was a task or not.
> > >
> > > Gregory did a similar fix a while ago, but that caused grief of some
> > > kind..
> > >
> > > Greg, can you recollect why we pulled it? I can't seem to find it.
> >
> > Gregory pointed me to this thread:
> >
> > http://lkml.org/lkml/2008/8/11/81
> >
> > ego, can you run sysbench to confirm?
>
> Am planning to run it today.
>
> Mike, with what --oltp-* mode did you run the sysbench test?
>
> That aside, if Mike's analysis is correct regarding the client/server
> pairs not running on the same CPU as buddies, shouldn't this be fixed in a
> higher level routine rather than have this anomaly in
> __load_balance_iterator(), which is supposed to return the runnable
> tasks in the cfs_rq?
>
> Its current behavior is that __load_balance_iterator() will
> return NULL even if the last entity in the list is a runnable task.
>
> This behavior clearly hinders sched_mc powersavings from migrating
> a sole remaining task from a powersavings-sched_domain in order
> to evacuate that domain and put all the CPUs of the domain into a
> low-power state.

Sure - there is buddy_hot in task_hot() to avoid moving buddies, and I
think we should do something like this:

@@ -590,7 +602,7 @@ account_entity_enqueue(struct cfs_rq *cfs_rq, struct sched_entity *se)
 		add_cfs_task_weight(cfs_rq, se->load.weight);
 	cfs_rq->nr_running++;
 	se->on_rq = 1;
-	list_add(&se->group_node, &cfs_rq->tasks);
+	list_add_tail(&se->group_node, &cfs_rq->tasks);
 }
 
 static void

(most likely whitespace damaged)
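The effect of the list_add() to list_add_tail() change suggested above on list order can be sketched in user space. This is only an illustration of insertion order under stated assumptions, not of scheduler behaviour: the list helpers are simplified re-implementations, and struct entity, its id field and list_order() are hypothetical names that do not appear in the thread.

```c
#include <stddef.h>

/* Minimal stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
	head->next = head->prev = head;
}

/* Link 'node' between two neighbouring nodes. */
static void insert_between(struct list_head *node,
			   struct list_head *prev, struct list_head *next)
{
	next->prev = node;
	node->next = next;
	node->prev = prev;
	prev->next = node;
}

/* Insert right after the head: newest entry is visited first. */
static void list_add(struct list_head *node, struct list_head *head)
{
	insert_between(node, head, head->next);
}

/* Insert right before the head: newest entry is visited last. */
static void list_add_tail(struct list_head *node, struct list_head *head)
{
	insert_between(node, head->prev, head);
}

/* Hypothetical entity that only carries an id for the demonstration. */
struct entity {
	int id;
	struct list_head group_node;
};

/* Copy the ids in traversal order into out[]; return how many were seen. */
static size_t list_order(const struct list_head *head, int *out, size_t max)
{
	const struct list_head *pos;
	size_t n = 0;

	for (pos = head->next; pos != head && n < max; pos = pos->next) {
		const struct entity *e = (const struct entity *)
			((const char *)pos - offsetof(struct entity, group_node));
		out[n++] = e->id;
	}
	return n;
}
```

Enqueuing ids 1, 2, 3 with list_add() makes a walk from head->next see 3, 2, 1, whereas list_add_tail() yields 1, 2, 3. An iterator that walks the list from the front would therefore reach the most recently enqueued entity first with list_add() and last with list_add_tail().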
* Re: [PATCH] sched: Fix __load_balance_iterator() for cfq with only one task
2008-09-12 6:35 ` Gautham R Shenoy
2008-09-12 6:56 ` Peter Zijlstra
@ 2008-09-12 7:05 ` Mike Galbraith
2008-09-12 10:57 ` Mike Galbraith
2 siblings, 0 replies; 9+ messages in thread
From: Mike Galbraith @ 2008-09-12 7:05 UTC (permalink / raw)
To: ego
Cc: Peter Zijlstra, Gregory Haskins, Vaidyanathan Srinivasan,
Balbir Singh, Ingo Molnar, linux-kernel, Dipankar Sarma,
Srivatsa Vaddagiri

[-- Attachment #1: Type: text/plain, Size: 188 bytes --]

On Fri, 2008-09-12 at 12:05 +0530, Gautham R Shenoy wrote:

> Mike, with what --oltp-* mode did you run the sysbench test?

My cobbled together via google test script is attached.

-Mike

[-- Attachment #2: sysbench.test --]
[-- Type: application/x-shellscript, Size: 2351 bytes --]
* Re: [PATCH] sched: Fix __load_balance_iterator() for cfq with only one task
2008-09-12 6:35 ` Gautham R Shenoy
2008-09-12 6:56 ` Peter Zijlstra
2008-09-12 7:05 ` Mike Galbraith
@ 2008-09-12 10:57 ` Mike Galbraith
2008-09-12 11:07 ` Gautham R Shenoy
2 siblings, 1 reply; 9+ messages in thread
From: Mike Galbraith @ 2008-09-12 10:57 UTC (permalink / raw)
To: ego
Cc: Peter Zijlstra, Gregory Haskins, Vaidyanathan Srinivasan,
Balbir Singh, Ingo Molnar, linux-kernel, Dipankar Sarma,
Srivatsa Vaddagiri

On Fri, 2008-09-12 at 12:05 +0530, Gautham R Shenoy wrote:
> On Fri, Sep 05, 2008 at 07:23:44PM +0200, Peter Zijlstra wrote:

> > ego, can you run sysbench to confirm?
>
> Am planning to run it today.

I just tested rc6 with 6d299f1b53b84e2665f402d9bcc494800aba6386 applied,
and it does not exhibit the problem which triggered the revert, so I'd
suggest reverting it. (Perhaps more clock manifestations)

-Mike
* Re: [PATCH] sched: Fix __load_balance_iterator() for cfq with only one task
2008-09-12 10:57 ` Mike Galbraith
@ 2008-09-12 11:07 ` Gautham R Shenoy
0 siblings, 0 replies; 9+ messages in thread
From: Gautham R Shenoy @ 2008-09-12 11:07 UTC (permalink / raw)
To: Mike Galbraith
Cc: Peter Zijlstra, Gregory Haskins, Vaidyanathan Srinivasan,
Balbir Singh, Ingo Molnar, linux-kernel, Dipankar Sarma,
Srivatsa Vaddagiri

On Fri, Sep 12, 2008 at 12:57:08PM +0200, Mike Galbraith wrote:
> On Fri, 2008-09-12 at 12:05 +0530, Gautham R Shenoy wrote:
> > On Fri, Sep 05, 2008 at 07:23:44PM +0200, Peter Zijlstra wrote:
>
> > > ego, can you run sysbench to confirm?
> >
> > Am planning to run it today.
>
> I just tested rc6 with 6d299f1b53b84e2665f402d9bcc494800aba6386 applied,
> and it does not exhibit the problem which triggered the revert, so I'd
> suggest reverting it. (Perhaps more clock manifestations)

That's good news! Thanks for doing this

>
> -Mike
>

--
Thanks and Regards
gautham
* Re: [PATCH] sched: Fix __load_balance_iterator() for cfq with only one task
2008-09-05 12:30 [PATCH] sched: Fix __load_balance_iterator() for cfq with only one task Gautham R Shenoy
2008-09-05 15:13 ` Peter Zijlstra
@ 2008-09-12 15:52 ` Chris Friesen
1 sibling, 0 replies; 9+ messages in thread
From: Chris Friesen @ 2008-09-12 15:52 UTC (permalink / raw)
To: ego
Cc: Peter Zijlstra, Vaidyanathan Srinivasan, Balbir Singh,
Ingo Molnar, linux-kernel, Dipankar Sarma

Gautham R Shenoy wrote:
> sched: Fix __load_balance_iterator() for cfq with only one task.
>
> From: Gautham R Shenoy <ego@in.ibm.com>
>
> The __load_balance_iterator() returns a NULL when there's only one
> sched_entity which is a task. It is caused by the following code-path.
>
>
> 	/* Skip over entities that are not tasks */
> 	do {
> 		se = list_entry(next, struct sched_entity, group_node);
> 		next = next->next;
> 	} while (next != &cfs_rq->tasks && !entity_is_task(se));
>
> 	if (next == &cfs_rq->tasks)
> 		return NULL;
> 	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 	This will return NULL even when se is a task.

Thank you! I'd been looking suspiciously at this routine as well due to
strange load-balancing behaviour that I saw while testing the fair group
code, but I hadn't yet tracked down the exact problem.

Peter/Ingo, this appears to explain the issues described in the mail I
sent on the 4th. After applying this change the imbalance between tasks
in the same group is substantially reduced.

Chris