* [PATCH 03/14] sched/topology: Fix building of overlapping sched-groups
[not found] <20170428131958.893188882@infradead.org>
@ 2017-04-28 13:20 ` Peter Zijlstra
2017-05-01 20:08 ` Rik van Riel
2017-04-28 13:20 ` [PATCH 10/14] sched/topology: Fix overlapping sched_group_mask Peter Zijlstra
1 sibling, 1 reply; 3+ messages in thread
From: Peter Zijlstra @ 2017-04-28 13:20 UTC
To: mingo, lvenanci; +Cc: lwang, riel, efault, tglx, linux-kernel, peterz, stable
[-- Attachment #1: peterz-sched-fix-build-overlapping-groups.patch --]
[-- Type: text/plain, Size: 1472 bytes --]
When building the overlapping groups, we very obviously should start
with the previous domain of _this_ @cpu, not CPU-0.

This can be readily demonstrated with a topology like:

  node   0   1   2   3
    0:  10  20  30  20
    1:  20  10  20  30
    2:  30  20  10  20
    3:  20  30  20  10

Where (for example) CPU1 ends up generating the following nonsensical groups:

  [] CPU1 attaching sched-domain:
  []  domain 0: span 0-2 level NUMA
  []   groups: 1 2 0
  []   domain 1: span 0-3 level NUMA
  []    groups: 1-3 (cpu_capacity = 3072) 0-1,3 (cpu_capacity = 3072)

Where the fact that domain 1 doesn't include a group with span 0-2 is
the obvious fail.

With patch this looks like:

  [] CPU1 attaching sched-domain:
  []  domain 0: span 0-2 level NUMA
  []   groups: 1 0 2
  []   domain 1: span 0-3 level NUMA
  []    groups: 0-2 (cpu_capacity = 3072) 0,2-3 (cpu_capacity = 3072)
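
The effect of the starting CPU on the resulting groups can be reproduced in
user space. The following is only a minimal sketch, not the kernel code: CPU
sets are plain bitmasks (bit N = CPU N), child_span[] is derived by hand from
the distance table above (each node reaches itself and its two neighbours at
distance 20), and build_groups() is a hypothetical stand-in for the covered
loop in build_overlap_sched_groups().

```c
#define NR_CPUS 4

/*
 * Span of each CPU's child domain (the distance <= 20 NUMA level),
 * written as a bitmask. Assumption: one CPU per node, as in the
 * example topology.
 */
static const unsigned int child_span[NR_CPUS] = {
	0xb,	/* CPU0: 0-1,3 */
	0x7,	/* CPU1: 0-2   */
	0xe,	/* CPU2: 1-3   */
	0xd,	/* CPU3: 0,2-3 */
};

/*
 * Walk the CPUs in @span beginning at @start and wrapping around.
 * Every CPU that is not yet covered contributes its child span as a
 * new group. Returns the number of groups stored in @groups.
 */
int build_groups(unsigned int span, int start, unsigned int *groups)
{
	unsigned int covered = 0;
	int n, nr = 0;

	for (n = 0; n < NR_CPUS; n++) {
		int i = (start + n) % NR_CPUS;

		if (!(span & (1u << i)))
			continue;	/* CPU not part of this domain */
		if (covered & (1u << i))
			continue;	/* already inside an earlier group */

		groups[nr++] = child_span[i];
		covered |= child_span[i];
	}

	return nr;
}
```

Calling build_groups(0xf, 0, g) models the old iteration from CPU0 and yields
the groups 0-1,3 and 1-3, neither of which is CPU1's own child span.
build_groups(0xf, 1, g) models the wrapped iteration for CPU1 and yields 0-2
and 0,2-3, matching the patched output.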
Cc: stable@vger.kernel.org
Fixes: e3589f6c81e4 ("sched: Allow for overlapping sched_domain spans")
Debugged-by: Lauro Ramos Venancio <lvenanci@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
kernel/sched/topology.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -525,7 +525,7 @@ build_overlap_sched_groups(struct sched_
 
 	cpumask_clear(covered);
 
-	for_each_cpu(i, span) {
+	for_each_cpu_wrap(i, span, cpu) {
 		struct cpumask *sg_span;
 
 		if (cpumask_test_cpu(i, covered))
* [PATCH 10/14] sched/topology: Fix overlapping sched_group_mask
From: Peter Zijlstra @ 2017-04-28 13:20 UTC
To: mingo, lvenanci; +Cc: lwang, riel, efault, tglx, linux-kernel, peterz, stable
[-- Attachment #1: peterz-sched-topo-sched_group_mask.patch --]
[-- Type: text/plain, Size: 2649 bytes --]
The point of sched_group_mask is to select those CPUs from
sched_group_cpus that can actually arrive at this balance domain.

The current code gets it wrong, as can be readily demonstrated with a
topology like:

  node   0   1   2   3
    0:  10  20  30  20
    1:  20  10  20  30
    2:  30  20  10  20
    3:  20  30  20  10

Where (for example) domain 1 on CPU1 ends up with a mask that includes
CPU0:

  [] CPU1 attaching sched-domain:
  []  domain 0: span 0-2 level NUMA
  []   groups: 1 (mask: 1), 2, 0
  []   domain 1: span 0-3 level NUMA
  []    groups: 0-2 (mask: 0-2) (cpu_capacity: 3072), 0,2-3 (cpu_capacity: 3072)

This causes sched_balance_cpu() to compute the wrong CPU and
consequently should_we_balance() will terminate early resulting in
missed load-balance opportunities.

The fixed topology looks like:

  [] CPU1 attaching sched-domain:
  []  domain 0: span 0-2 level NUMA
  []   groups: 1 (mask: 1), 2, 0
  []   domain 1: span 0-3 level NUMA
  []    groups: 0-2 (mask: 1) (cpu_capacity: 3072), 0,2-3 (cpu_capacity: 3072)

(Note: this relies on OVERLAP domains always having children. This holds
because the regular topology domains are still present at this point,
which is before degenerate trimming.)
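
Why the wider mask loses balance opportunities can be sketched as follows.
This is a simplified user-space model, not the scheduler code: it assumes the
balance CPU of a group is the first CPU in the intersection of the group span
and the group mask, and it ignores the preference for idle CPUs. The helper
names first_cpu_and() and may_balance() are hypothetical, and CPU sets are
plain bitmasks (bit N = CPU N).

```c
/*
 * Index of the lowest CPU present in both @group and @mask,
 * or -1 if the intersection is empty.
 */
int first_cpu_and(unsigned int group, unsigned int mask)
{
	unsigned int both = group & mask;
	int cpu;

	for (cpu = 0; cpu < 32; cpu++) {
		if (both & (1u << cpu))
			return cpu;
	}

	return -1;
}

/*
 * Simplified balance check: only the designated balance CPU of the
 * local group is allowed to continue balancing at this level.
 */
int may_balance(int this_cpu, unsigned int group, unsigned int mask)
{
	return first_cpu_and(group, mask) == this_cpu;
}
```

With the broken mask, may_balance(1, 0x7, 0x7) is false because CPU0 is
selected for group 0-2, yet CPU0's own local group at this level is 0-1,3, so
in this model nobody balances on behalf of group 0-2. With the fixed mask,
may_balance(1, 0x7, 0x2) is true.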
Cc: stable@vger.kernel.org
Fixes: e3589f6c81e4 ("sched: Allow for overlapping sched_domain spans")
Debugged-by: Lauro Ramos Venancio <lvenanci@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
kernel/sched/topology.c | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -495,6 +495,9 @@ enum s_alloc {
 /*
  * Build an iteration mask that can exclude certain CPUs from the upwards
  * domain traversal.
+ *
+ * Only CPUs that can arrive at this group should be considered to continue
+ * balancing.
  */
 static void build_group_mask(struct sched_domain *sd, struct sched_group *sg)
 {
@@ -505,11 +508,24 @@ static void build_group_mask(struct sche
 
 	for_each_cpu(i, sg_span) {
 		sibling = *per_cpu_ptr(sdd->sd, i);
-		if (!cpumask_test_cpu(i, sched_domain_span(sibling)))
+
+		/*
+		 * Can happen in the asymmetric case, where these siblings are
+		 * unused. The mask will not be empty because those CPUs that
+		 * do have the top domain _should_ span the domain.
+		 */
+		if (!sibling->child)
+			continue;
+
+		/* If we would not end up here, we can't continue from here */
+		if (!cpumask_equal(sg_span, sched_domain_span(sibling->child)))
 			continue;
 
 		cpumask_set_cpu(i, sched_group_mask(sg));
 	}
+
+	/* We must not have empty masks here */
+	WARN_ON_ONCE(cpumask_empty(sched_group_mask(sg)));
 }
 
 /*
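
The change in the membership test can be replayed on the example topology in
user space. This is a minimal sketch under stated assumptions, not the kernel
implementation: CPU sets are bitmasks (bit N = CPU N), domain_span[] and
child_span[] are filled in by hand from the distance table in the changelog,
and group_mask_old() / group_mask_new() are hypothetical names for the test
before and after this patch. The sibling->child NULL check is left out
because every CPU in this topology has a child domain.

```c
#define NR_CPUS 4

/* Span of each CPU's top (distance <= 30) domain: all four CPUs. */
static const unsigned int domain_span[NR_CPUS] = { 0xf, 0xf, 0xf, 0xf };

/* Span of each CPU's child (distance <= 20) domain. */
static const unsigned int child_span[NR_CPUS] = {
	0xb,	/* CPU0: 0-1,3 */
	0x7,	/* CPU1: 0-2   */
	0xe,	/* CPU2: 1-3   */
	0xd,	/* CPU3: 0,2-3 */
};

/* Old test: keep CPU i if it lies inside its own domain span. */
unsigned int group_mask_old(unsigned int sg_span)
{
	unsigned int mask = 0;
	int i;

	for (i = 0; i < NR_CPUS; i++) {
		if (!(sg_span & (1u << i)))
			continue;
		if (!(domain_span[i] & (1u << i)))
			continue;
		mask |= 1u << i;
	}

	return mask;
}

/* New test: keep CPU i only if its child span equals the group span. */
unsigned int group_mask_new(unsigned int sg_span)
{
	unsigned int mask = 0;
	int i;

	for (i = 0; i < NR_CPUS; i++) {
		if (!(sg_span & (1u << i)))
			continue;
		if (child_span[i] != sg_span)
			continue;
		mask |= 1u << i;
	}

	return mask;
}
```

For the group with span 0-2 (0x7), group_mask_old() returns 0x7 (mask 0-2),
while group_mask_new() returns 0x2 (mask 1), matching the two domain dumps in
the changelog. For the group 0,2-3 (0xd) the new test selects only CPU3.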
* Re: [PATCH 03/14] sched/topology: Fix building of overlapping sched-groups
From: Rik van Riel @ 2017-05-01 20:08 UTC
To: Peter Zijlstra, mingo, lvenanci; +Cc: lwang, efault, tglx, linux-kernel, stable
On Fri, 2017-04-28 at 15:20 +0200, Peter Zijlstra wrote:
> When building the overlapping groups, we very obviously should start
> with the previous domain of _this_ @cpu, not CPU-0.
> With patch this looks like:
>
> [] CPU1 attaching sched-domain:
> [] domain 0: span 0-2 level NUMA
> [] groups: 1 0 2
> [] domain 1: span 0-3 level NUMA
> [] groups: 0-2 (cpu_capacity = 3072) 0,2-3 (cpu_capacity = 3072)
>
> Cc: stable@vger.kernel.org
> Fixes: e3589f6c81e4 ("sched: Allow for overlapping sched_domain
> spans")
> Debugged-by: Lauro Ramos Venancio <lvenanci@redhat.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Rik van Riel <riel@redhat.com>