From: Peter Zijlstra <peterz@infradead.org>
To: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Miao Xie <miaox@cn.fujitsu.com>,
Linux-Kernel <linux-kernel@vger.kernel.org>,
containers <containers@lists.linux-foundation.org>,
Ingo Molnar <mingo@elte.hu>
Subject: Re: [BUG] cpu controller can't provide fair CPU time for each group
Date: Wed, 11 Nov 2009 08:20:07 +0100
Message-ID: <1257924007.23203.18.camel@twins>
In-Reply-To: <20091111134910.5F42.E1E9C6FF@jp.fujitsu.com>
On Wed, 2009-11-11 at 15:21 +0900, Yasunori Goto wrote:
> When users use cpuset/cpu affinity, they want to control CPU affinity,
> not CPU time.
What are people using affinity for? The only use of affinity is to
restrict or disable the load-balancer. Don't complain that the
load-balancer doesn't work when you're taking active steps to hinder its
work.
If you don't want things load-balanced, turn it off; if you want the
load-balancer to work on smaller groups of cpus, use cpusets.
Anyway, I said something needs to be done because the interaction
between cpusets and the cpu-controller is utter crap; they never should
have been separated like they are.
> To be honest, I don't have any good ideas because I'm not familiar with
> the scheduler's code. But I have one question.
>
>
> static int tg_shares_up(struct task_group *tg, void *data)
> {
>         unsigned long weight, rq_weight = 0, shares = 0;
>
> (snip)
>
>         for_each_cpu(i, sched_domain_span(sd)) {
>                 weight = tg->cfs_rq[i]->load.weight;
>                 usd->rq_weight[i] = weight;
>
>                 /*
>                  * If there are currently no tasks on the cpu pretend there
>                  * is one of average load so that when a new task gets to
>                  * run here it will not get delayed by group starvation.
>                  */
>                 if (!weight)
>                         weight = NICE_0_LOAD; ---------(*)
>
> I heard from the test team that when (*) was removed, 1) didn't occur.
>
> The comment says (*) is there to avoid a starvation condition.
> However, I don't understand why NICE_0_LOAD must be specified.
> Could you tell me why a small value (like 2 or 3) is not used for (*)?
> What would the side effect be?
Exactly what the comment says: a new task will get delayed because the
group won't get scheduled on that cpu until all the group weights get
re-adjusted again, which can take much longer than the typical runtimes
of the workload in question.
Regular weights are NICE_0_LOAD; if you stick a 3 next to that, it will
not get run much -> starvation.
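To put rough numbers on that, here is a minimal sketch of the arithmetic. It assumes NICE_0_LOAD is 1024 and that an entity's CPU share is simply its weight divided by the total weight on the runqueue; cpu_share_ppm() is a hypothetical helper for illustration, not a scheduler function.

```c
/* Assumption: the nice-0 weight is 1024 (1 << 10). */
#define NICE_0_LOAD 1024UL

/*
 * Hypothetical helper, not kernel code: the share of one CPU, in parts
 * per million, that an entity of weight `weight` gets while competing
 * with entities whose weights sum to `others`, under a plain
 * proportional-share model (weight / total weight).
 */
static unsigned long cpu_share_ppm(unsigned long weight, unsigned long others)
{
        return weight * 1000000UL / (weight + others);
}
```

Under this model, cpu_share_ppm(NICE_0_LOAD, NICE_0_LOAD) is 500000, i.e. half the cpu, while cpu_share_ppm(3, NICE_0_LOAD) is 2921, i.e. about 0.3%. That is the starvation a newly woken task would see until the group weights are recomputed.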