From: Preeti U Murthy <preeti@linux.vnet.ibm.com>
To: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Cc: mikey@neuling.org, vincent.guittot@linaro.org,
peterz@infradead.org, linux-kernel@vger.kernel.org,
Morten.Rasmussen@arm.com, bitbucket@online.de, anton@samba.org,
linuxppc-dev@lists.ozlabs.org, mingo@kernel.org, pjt@google.com
Subject: Re: [PATCH V2 2/2] sched: Remove un-necessary iteration over sched domains to update nr_busy_cpus
Date: Wed, 30 Oct 2013 15:33:32 +0530
Message-ID: <5270D974.3090003@linux.vnet.ibm.com>
In-Reply-To: <20131030092313.GA4196@linux.vnet.ibm.com>

Hi Kamalesh,

On 10/30/2013 02:53 PM, Kamalesh Babulal wrote:
> Hi Preeti,
>
>> nr_busy_cpus parameter is used by nohz_kick_needed() to find out the number
>> of busy cpus in a sched domain which has SD_SHARE_PKG_RESOURCES flag set.
>> Therefore instead of updating nr_busy_cpus at every level of sched domain,
>> since it is irrelevant, we can update this parameter only at the parent
>> domain of the sd which has this flag set. Introduce a per-cpu parameter
>> sd_busy which represents this parent domain.
>>
>> In nohz_kick_needed() we directly query the nr_busy_cpus parameter
>> associated with the groups of sd_busy.
>>
>> By associating sd_busy with the highest domain which has
>> SD_SHARE_PKG_RESOURCES flag set, we cover all lower level domains which could
>> have this flag set and trigger nohz_idle_balancing if any of the levels have
>> more than one busy cpu.
>>
>> sd_busy is irrelevant for asymmetric load balancing. However sd_asym has been
>> introduced to represent the highest sched domain which has SD_ASYM_PACKING flag set
>> so that it can be queried directly when required.
>>
>> While we are at it, we might as well change the nohz_idle parameter to be
>> updated at the sd_busy domain level alone and not the base domain level of a CPU.
>> This will unify the concept of busy cpus at just one level of sched domain
>> where it is currently used.
>>
>> Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
>> ---
>> kernel/sched/core.c | 6 ++++++
>> kernel/sched/fair.c | 38 ++++++++++++++++++++------------------
>> kernel/sched/sched.h | 2 ++
>> 3 files changed, 28 insertions(+), 18 deletions(-)
>>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index c06b8d3..e6a6244 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -5271,6 +5271,8 @@ DEFINE_PER_CPU(struct sched_domain *, sd_llc);
>> DEFINE_PER_CPU(int, sd_llc_size);
>> DEFINE_PER_CPU(int, sd_llc_id);
>> DEFINE_PER_CPU(struct sched_domain *, sd_numa);
>> +DEFINE_PER_CPU(struct sched_domain *, sd_busy);
>> +DEFINE_PER_CPU(struct sched_domain *, sd_asym);
>>
>> static void update_top_cache_domain(int cpu)
>> {
>> @@ -5282,6 +5284,7 @@ static void update_top_cache_domain(int cpu)
>> if (sd) {
>> id = cpumask_first(sched_domain_span(sd));
>> size = cpumask_weight(sched_domain_span(sd));
>> + rcu_assign_pointer(per_cpu(sd_busy, cpu), sd->parent);
>> }
>
>
> Consider a machine with a single socket, dual core, with HT enabled. The topmost
> domain is also the highest domain with the SD_SHARE_PKG_RESOURCES flag set,
> i.e. the MC domain (the machine topology consists of the SIBLING and MC domains).
>
> # lstopo-no-graphics --no-bridges --no-io
> Machine (7869MB) + Socket L#0 + L3 L#0 (3072KB)
> L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
> PU L#0 (P#0)
> PU L#1 (P#1)
> L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1
> PU L#2 (P#2)
> PU L#3 (P#3)
>
> With this approach the parent of the MC domain is NULL and, given that sd_busy
> is NULL, nr_busy_cpus of sched domain sd_busy will never be
> incremented/decremented, resulting in nohz_kick_needed() returning 0.
Right, and it *should* return 0. The MC domain has no parent, so there is
no sibling domain that can offload tasks from it. Therefore there is no
point in kicking nohz idle balance.
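
As a rough illustration of that behaviour, here is a minimal user-space
sketch. It is not the kernel code: the struct layouts are simplified
stand-ins, the share_pkg_resources field merely models the
SD_SHARE_PKG_RESOURCES flag, and the helpers pick_sd_busy(),
busy_check_wants_kick(), single_socket_wants_kick() and
two_socket_wants_kick() are hypothetical names made up for this example.
It also assumes that the nr_busy_cpus check in nohz_kick_needed() is
simply skipped when sd_busy is NULL, since the fair.c hunk is not quoted
above.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumed layout). */
struct sched_group_power { int nr_busy_cpus; };
struct sched_group { struct sched_group_power *sgp; };
struct sched_domain {
	struct sched_domain *parent;	/* NULL for the topmost domain */
	struct sched_group *groups;
	int share_pkg_resources;	/* models SD_SHARE_PKG_RESOURCES */
};

/* Walk up from the base domain; return the highest domain with the flag. */
static struct sched_domain *highest_shared_domain(struct sched_domain *sd)
{
	struct sched_domain *hsd = NULL;

	for (; sd; sd = sd->parent) {
		if (!sd->share_pkg_resources)
			break;
		hsd = sd;
	}
	return hsd;
}

/* Models the quoted hunk: sd_busy is the parent of that highest domain. */
static struct sched_domain *pick_sd_busy(struct sched_domain *base)
{
	struct sched_domain *sd = highest_shared_domain(base);

	return sd ? sd->parent : NULL;
}

/* Models only the busy-cpu check; returns 1 if a kick would be requested. */
static int busy_check_wants_kick(const struct sched_domain *sd_busy)
{
	if (!sd_busy)
		return 0;	/* no parent domain, nothing to offload to */
	assert(sd_busy->groups && sd_busy->groups->sgp);
	return sd_busy->groups->sgp->nr_busy_cpus > 1;
}

/* The single socket topology above: SIBLING -> MC, and MC has no parent. */
static int single_socket_wants_kick(void)
{
	struct sched_domain mc = { NULL, NULL, 1 };
	struct sched_domain sibling = { &mc, NULL, 1 };

	return busy_check_wants_kick(pick_sd_busy(&sibling));
}

/* For contrast: SIBLING -> MC -> package level domain without the flag. */
static int two_socket_wants_kick(int nr_busy)
{
	struct sched_group_power sgp = { nr_busy };
	struct sched_group sg = { &sgp };
	struct sched_domain pkg = { NULL, &sg, 0 };
	struct sched_domain mc = { &pkg, NULL, 1 };
	struct sched_domain sibling = { &mc, NULL, 1 };

	return busy_check_wants_kick(pick_sd_busy(&sibling));
}
```

In this model single_socket_wants_kick() never requests a kick because
sd_busy is NULL, whereas two_socket_wants_kick() requests one only when
the first group of the parent domain has more than one busy cpu.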
Regards
Preeti U Murthy
>
> Thanks,
> Kamalesh.
>