From: tip-bot for Peter Zijlstra <a.p.zijlstra@chello.nl>
To: linux-tip-commits@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, hpa@zytor.com, mingo@redhat.com,
a.p.zijlstra@chello.nl, tglx@linutronix.de, mingo@elte.hu
Subject: [tip:sched/core] sched: Fix cgroup smp fairness
Date: Sun, 2 Aug 2009 13:12:22 GMT [thread overview]
Message-ID: <tip-a5004278f0525dcb9aa43703ef77bf371ea837cd@git.kernel.org> (raw)
In-Reply-To: <1248696557.6987.1615.camel@twins>
Commit-ID: a5004278f0525dcb9aa43703ef77bf371ea837cd
Gitweb: http://git.kernel.org/tip/a5004278f0525dcb9aa43703ef77bf371ea837cd
Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
AuthorDate: Mon, 27 Jul 2009 14:04:49 +0200
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Sun, 2 Aug 2009 14:26:06 +0200
sched: Fix cgroup smp fairness
Commit ec4e0e2fe018992d980910db901637c814575914 ("fix
inconsistency when redistribute per-cpu tg->cfs_rq shares")
broke cgroup smp fairness.
In order to avoid starvation of newly placed tasks, we never
quite set the share of an empty cpu group-task to 0, but
instead we set it as if there's a single NICE-0 task present.
If however we actually set this in cfs_rq[cpu]->shares, that
means the total shares for that group will be slightly inflated
every time we balance, causing the observed unfairness.
Fix this by setting cfs_rq[cpu]->shares to 0 but actually
setting the effective weight of the related se to the inflated
number.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1248696557.6987.1615.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
kernel/sched.c | 28 ++++++++++++++++++++--------
1 files changed, 20 insertions(+), 8 deletions(-)
diff --git a/kernel/sched.c b/kernel/sched.c
index ce1056e..26976cd 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -1523,13 +1523,18 @@ static void
update_group_shares_cpu(struct task_group *tg, int cpu,
unsigned long sd_shares, unsigned long sd_rq_weight)
{
- unsigned long shares;
unsigned long rq_weight;
+ unsigned long shares;
+ int boost = 0;
if (!tg->se[cpu])
return;
rq_weight = tg->cfs_rq[cpu]->rq_weight;
+ if (!rq_weight) {
+ boost = 1;
+ rq_weight = NICE_0_LOAD;
+ }
/*
* \Sum shares * rq_weight
@@ -1546,8 +1551,7 @@ update_group_shares_cpu(struct task_group *tg, int cpu,
unsigned long flags;
spin_lock_irqsave(&rq->lock, flags);
- tg->cfs_rq[cpu]->shares = shares;
-
+ tg->cfs_rq[cpu]->shares = boost ? 0 : shares;
__set_se_shares(tg->se[cpu], shares);
spin_unlock_irqrestore(&rq->lock, flags);
}
@@ -1560,7 +1564,7 @@ update_group_shares_cpu(struct task_group *tg, int cpu,
*/
static int tg_shares_up(struct task_group *tg, void *data)
{
- unsigned long weight, rq_weight = 0;
+ unsigned long weight, rq_weight = 0, eff_weight = 0;
unsigned long shares = 0;
struct sched_domain *sd = data;
int i;
@@ -1572,11 +1576,13 @@ static int tg_shares_up(struct task_group *tg, void *data)
* run here it will not get delayed by group starvation.
*/
weight = tg->cfs_rq[i]->load.weight;
+ tg->cfs_rq[i]->rq_weight = weight;
+ rq_weight += weight;
+
if (!weight)
weight = NICE_0_LOAD;
- tg->cfs_rq[i]->rq_weight = weight;
- rq_weight += weight;
+ eff_weight += weight;
shares += tg->cfs_rq[i]->shares;
}
@@ -1586,8 +1592,14 @@ static int tg_shares_up(struct task_group *tg, void *data)
if (!sd->parent || !(sd->parent->flags & SD_LOAD_BALANCE))
shares = tg->shares;
- for_each_cpu(i, sched_domain_span(sd))
- update_group_shares_cpu(tg, i, shares, rq_weight);
+ for_each_cpu(i, sched_domain_span(sd)) {
+ unsigned long sd_rq_weight = rq_weight;
+
+ if (!tg->cfs_rq[i]->rq_weight)
+ sd_rq_weight = eff_weight;
+
+ update_group_shares_cpu(tg, i, shares, sd_rq_weight);
+ }
return 0;
}
prev parent reply other threads:[~2009-08-02 13:13 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-07-23 7:57 CFS group scheduler fairness broken starting from 2.6.29-rc1 Bharata B Rao
2009-07-23 22:17 ` Ken Chen
2009-07-24 4:30 ` Bharata B Rao
2009-07-27 12:09 ` Peter Zijlstra
2009-07-28 4:14 ` Bharata B Rao
2009-07-28 7:28 ` Peter Zijlstra
2009-08-02 13:12 ` tip-bot for Peter Zijlstra [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=tip-a5004278f0525dcb9aa43703ef77bf371ea837cd@git.kernel.org \
--to=a.p.zijlstra@chello.nl \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-tip-commits@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=mingo@redhat.com \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.