From: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Michael Neuling <mikey@neuling.org>,
Suresh Siddha <suresh.b.siddha@intel.com>,
Gautham R Shenoy <ego@in.ibm.com>,
linux-kernel@vger.kernel.org, linuxppc-dev@ozlabs.org,
Ingo Molnar <mingo@elte.hu>
Subject: Re: [PATCH 1/5] sched: fix capacity calculations for SMT4
Date: Wed, 2 Jun 2010 04:22:50 +0530 [thread overview]
Message-ID: <20100601225250.GA7764@dirshya.in.ibm.com> (raw)
In-Reply-To: <1275294796.27810.21554.camel@twins>
* Peter Zijlstra <peterz@infradead.org> [2010-05-31 10:33:16]:
> On Fri, 2010-04-16 at 15:58 +0200, Peter Zijlstra wrote:
> >
> >
> > Hrmm, my brain seems muddled but I might have another solution, let me
> > ponder this for a bit..
> >
>
> Right, so the thing I was thinking about is taking the group capacity
> into account when determining the capacity for a single cpu.
>
> Say the group contains all the SMT siblings, then use the group capacity
> (usually larger than 1024) and then distribute the capacity over the
> group members, preferring CPUs with higher individual cpu_power over
> those with less.
>
> So suppose you've got 4 siblings with cpu_power=294 each, then we assign
> capacity 1 to the first member, and the remaining 153 is insufficient,
> and thus we stop and the rest lives with 0 capacity.
>
> Now take the example that the first sibling would be running a heavy RT
> load, and its cpu_power would be reduced to say, 50, then we still got
> nearly 933 left over the others, which is still sufficient for one
> capacity, but because the first sibling is low, we'll assign it 0 and
> instead assign 1 to the second, again, leaving the third and fourth 0.
Hi Peter,
Thanks for the suggestion.
> If the group were a core group, the total would be much higher and we'd
> likely end up assigning 1 to each before we'd run out of capacity.
This is a tricky case because we are depending upon the
DIV_ROUND_CLOSEST to decide whether to flag capacity to 0 or 1. We
will not have any task movement until capacity is depleted to quite
low value due to RT task. Having a threshold to flag 0/1 instead of
DIV_ROUND_CLOSEST just like you have suggested in the power savings
case may help here as well to move tasks to other idle cores.
> For power savings, we can lower the threshold and maybe use the maximal
> individual cpu_power in the group to base 1 capacity from.
>
> So, suppose the second example, where sibling0 has 50 and the others
> have 294, you'd end up with a capacity distribution of: {0,1,1,1}.
One challenge here is that if RT tasks run on more that one thread in
this group, we will have slightly different cpu powers. Arranging
them from max to min and having a cutoff threshold should work.
Should we keep the RT scaling as a separate entity along with
cpu_power to simplify these thresholds. Whenever we need to scale
group load with cpu power can take the product of cpu_power and
scale_rt_power but in these cases where we compute capacity, we can
mark a 0 or 1 just based on whether scale_rt_power was less than
SCHED_LOAD_SCALE or not. Alternatively we can keep cpu_power as
a product of all scaling factors as it is today but save the component
scale factors also like scale_rt_power() and arch_scale_freq_power()
so that it can be used in load balance decisions.
Basically in power save balance we would give all threads a capacity
'1' unless the cpu_power was reduced due to RT task. Similarly in
the non-power save case, we can have flag 1,0,0,0 unless first thread
had a RT scaling during the last interval.
I am suggesting to distinguish the reduction is cpu_power due to
architectural (hardware DVFS) reasons from RT tasks so that it is easy
to decide if moving tasks to sibling thread or core can help or not.
--Vaidy
next prev parent reply other threads:[~2010-06-01 22:52 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-04-09 6:21 [PATCH 0/5] sched: asymmetrical packing for POWER7 SMT4 Michael Neuling
2010-04-09 6:21 ` [PATCH 2/5] sched: add asymmetric packing option for sibling domain Michael Neuling
2010-04-13 12:29 ` Peter Zijlstra
2010-04-14 6:09 ` Michael Neuling
2010-04-09 6:21 ` [PATCH 4/5] sched: Mark the balance type for use in need_active_balance() Michael Neuling
2010-04-13 12:29 ` Peter Zijlstra
2010-04-15 4:15 ` Michael Neuling
2010-04-09 6:21 ` [PATCH 3/5] powerpc: enabled asymmetric SMT scheduling on POWER7 Michael Neuling
2010-04-09 6:48 ` Michael Neuling
2010-04-09 6:21 ` [PATCH 1/5] sched: fix capacity calculations for SMT4 Michael Neuling
2010-04-13 12:29 ` Peter Zijlstra
2010-04-14 4:28 ` Michael Neuling
2010-04-16 13:58 ` Peter Zijlstra
2010-04-18 21:34 ` Michael Neuling
2010-04-19 14:49 ` Peter Zijlstra
2010-04-19 20:45 ` Michael Neuling
2010-04-29 6:55 ` Michael Neuling
2010-05-31 8:33 ` Peter Zijlstra
2010-06-01 22:52 ` Vaidyanathan Srinivasan [this message]
2010-06-03 8:56 ` Peter Zijlstra
2010-06-07 15:06 ` Srivatsa Vaddagiri
2010-04-09 6:21 ` [PATCH 5/5] sched: make fix_small_imbalance work with asymmetric packing Michael Neuling
2010-04-13 12:29 ` Peter Zijlstra
2010-04-14 1:31 ` Suresh Siddha
2010-04-15 5:06 ` Michael Neuling
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100601225250.GA7764@dirshya.in.ibm.com \
--to=svaidy@linux.vnet.ibm.com \
--cc=ego@in.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@ozlabs.org \
--cc=mikey@neuling.org \
--cc=mingo@elte.hu \
--cc=peterz@infradead.org \
--cc=suresh.b.siddha@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).