From: Preeti U Murthy <preeti@linux.vnet.ibm.com>
To: Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
Peter Zijlstra <peterz@infradead.org>
Cc: Preeti Murthy <preeti.lkml@gmail.com>,
mingo@kernel.org, hpa@zytor.com, linux-kernel@vger.kernel.org,
Thomas Gleixner <tglx@linutronix.de>,
mikey@neuling.org, linux-tip-commits@vger.kernel.org
Subject: Re: [PATCH v2] sched: Check sched_domain before computing group power.
Date: Thu, 14 Nov 2013 11:36:27 +0530 [thread overview]
Message-ID: <52846863.4060001@linux.vnet.ibm.com> (raw)
In-Reply-To: <20131113112329.GC543@linux.vnet.ibm.com>
Hi,
On 11/13/2013 04:53 PM, Srikar Dronamraju wrote:
> * Preeti Murthy <preeti.lkml@gmail.com> [2013-11-13 16:22:37]:
>
>> Hi Srikar,
>>
>> update_group_power() is called only during load balancing during
>> update_sg_lb_stats().
>> Load balancing begins at the base domain of the CPU,rq(cpu)->sd. This is
>> checked for
>> NULL. So how can update_group_power() be called in a scenario where the
>> base domain
>> of the CPU is not initialized? I say 'initialized' since you check for NULL
>> on rq(cpu)->sd.
>>
>
> update_group_power() also gets called from init_sched_groups_power().
> And if you see the oops message, we know that the oops happens from that
> path. In build_sched_domains(), we do cpu_attach_domain() what updates
> rq->sd after the call to init_sched_groups_power(). So by the time
> init_sched_groups_power() is called rq->sd is not yet initialized.
>
> We only hit oops case, when the sd->flags has SD_OVERLAP set.
>
>
>
>> In the changelog, you say 'updated'. Are you saying that it has a stale
>> value?
>
> I said, "before the sched_domain for a cpu is updated", so its not yet
> updated or has stale value. As I said earlier in this mail, the
> initialization happens after we do a update_group_power().
>
>> Please do elaborate on how you observed this.
>>
>
> Does this clarify?
Yes it clarifies, thank you.
However I was thinking that a better fix would be to reorder the way we call
update_group_power() and cpu_attach_domain(). Why do we need to do
update_group_power() of the groups of the sched domains that would probably
degenerate in cpu_attach_domain()? So it seemed best to move update_group_power()
to after cpu_attach_domain() so that it saves unnecessary iterations over
sched domains which could degenerate, and it fixes the issue that you have brought out
as well. See below for the patch:
-------------------------------------------------------------------------------
sched: Update power of sched groups after sched domains have been attached to CPUs
From: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Avoid iterating unnecessarily over the sched domains which could potentially
degenerate, while updating sched groups' power. This can be done by moving
the call to init_sched_groups_power() to after cpu_attach_domain(), when the
possibility of degenerating sched domains is examined and appropriately sched
domains are degenerated.
But claim_allocations() which does a NULL on the struct sd_data members for
each sched domain should iterate over all the initally built sched domains.
So move claim_allocations() to a loop where we build sched groups for each domain.
We would not require to reference sd_data after sched domains and sched
groups have been built.
Another use of this re-ordering is with reference to the commit
"sched/fair: Fix group power_orig computation". After this commit,
we end up dereferencing cpu_rq(cpu)->sd in update_group_power(). This would
lead to a NULL pointer since this parameter is updated after call to
update_group_power() in build_sched_domains() during initialization of sched
domains per CPU. The below change prevents this from occuring.
Signed-off-by: Preeti U. Murthy <preeti@linux.vnet.ibm.com>
---
kernel/sched/core.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e6a6244..d9703ac 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6211,17 +6211,7 @@ static int build_sched_domains(const struct cpumask *cpu_map,
if (build_sched_groups(sd, i))
goto error;
}
- }
- }
-
- /* Calculate CPU power for physical packages and nodes */
- for (i = nr_cpumask_bits-1; i >= 0; i--) {
- if (!cpumask_test_cpu(i, cpu_map))
- continue;
-
- for (sd = *per_cpu_ptr(d.sd, i); sd; sd = sd->parent) {
claim_allocations(i, sd);
- init_sched_groups_power(i, sd);
}
}
@@ -6233,6 +6223,16 @@ static int build_sched_domains(const struct cpumask *cpu_map,
}
rcu_read_unlock();
+ /* Calculate CPU power for physical packages and nodes */
+ for (i = nr_cpumask_bits-1; i >= 0; i--) {
+ if (!cpumask_test_cpu(i, cpu_map))
+ continue;
+
+ for (sd = *per_cpu_ptr(d.sd, i); sd; sd = sd->parent)
+ init_sched_groups_power(i, sd);
+ }
+
+
ret = 0;
error:
__free_domain_allocs(&d, alloc_state, cpu_map);
>
Regards
Preeti U. Murthy
next prev parent reply other threads:[~2013-11-14 6:09 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-09-12 18:05 [tip:sched/core] sched/fair: Fix group power_orig computation tip-bot for Peter Zijlstra
2013-09-12 23:21 ` Michael Neuling
2013-11-12 10:55 ` Srikar Dronamraju
2013-11-12 11:57 ` Peter Zijlstra
2013-11-12 16:41 ` [PATCH v2] sched: Check sched_domain before computing group power Srikar Dronamraju
2013-11-12 17:03 ` Peter Zijlstra
2013-11-12 17:15 ` Srikar Dronamraju
2013-11-12 17:55 ` Peter Zijlstra
2013-11-13 5:55 ` Srikar Dronamraju
[not found] ` <CAM4v1pNMn=5oZDiX3fUp9uPkZTPJgk=vEKEjevzvpwn=PjTzXg@mail.gmail.com>
2013-11-13 11:23 ` Srikar Dronamraju
2013-11-14 6:06 ` Preeti U Murthy [this message]
2013-11-14 8:30 ` Peter Zijlstra
2013-11-14 9:12 ` Preeti U Murthy
2013-11-13 15:17 ` Peter Zijlstra
2013-11-14 10:50 ` Srikar Dronamraju
2013-11-14 11:15 ` Peter Zijlstra
2013-11-19 19:15 ` [tip:sched/urgent] " tip-bot for Srikar Dronamraju
2013-11-19 23:36 ` Yinghai Lu
2013-11-21 15:03 ` Peter Zijlstra
2013-11-21 17:22 ` Yinghai Lu
2013-11-21 22:03 ` Yinghai Lu
2013-11-28 3:02 ` David Rientjes
2013-11-28 7:07 ` Yinghai Lu
2013-11-28 9:38 ` Peter Zijlstra
2013-11-28 20:23 ` Yinghai Lu
2013-12-06 6:24 ` Yinghai Lu
2013-12-10 10:58 ` Peter Zijlstra
2013-12-10 21:26 ` Yinghai Lu
2013-11-22 12:07 ` Peter Zijlstra
2013-11-23 5:00 ` Yinghai Lu
2013-11-23 18:53 ` Peter Zijlstra
2013-11-28 2:57 ` David Rientjes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=52846863.4060001@linux.vnet.ibm.com \
--to=preeti@linux.vnet.ibm.com \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-tip-commits@vger.kernel.org \
--cc=mikey@neuling.org \
--cc=mingo@kernel.org \
--cc=peterz@infradead.org \
--cc=preeti.lkml@gmail.com \
--cc=srikar@linux.vnet.ibm.com \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).