All of lore.kernel.org
 help / color / mirror / Atom feed
From: Valentin Schneider <valentin.schneider@arm.com>
To: Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Michael Ellerman <mpe@ellerman.id.au>
Cc: Nathan Lynch <nathanl@linux.ibm.com>,
	Gautham R Shenoy <ego@linux.vnet.ibm.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	Rik van Riel <riel@surriel.com>,
	linuxppc-dev@lists.ozlabs.org,
	Geetika Moolchandani <Geetika.Moolchandani1@ibm.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Laurent Dufour <ldufour@linux.ibm.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>
Subject: Re: [PATCH v2 1/2] sched/topology: Skip updating masks for non-online nodes
Date: Thu, 01 Jul 2021 15:28:45 +0100	[thread overview]
Message-ID: <875yxu85wi.mognet@arm.com> (raw)
In-Reply-To: <20210701041552.112072-2-srikar@linux.vnet.ibm.com>

On 01/07/21 09:45, Srikar Dronamraju wrote:
> @@ -1891,12 +1894,30 @@ void sched_init_numa(void)
>  void sched_domains_numa_masks_set(unsigned int cpu)
>  {
>       int node = cpu_to_node(cpu);
> -	int i, j;
> +	int i, j, empty;
>
> +	empty = cpumask_empty(sched_domains_numa_masks[0][node]);
>       for (i = 0; i < sched_domains_numa_levels; i++) {
>               for (j = 0; j < nr_node_ids; j++) {
> -			if (node_distance(j, node) <= sched_domains_numa_distance[i])
> +			if (!node_online(j))
> +				continue;
> +
> +			if (node_distance(j, node) <= sched_domains_numa_distance[i]) {
>                               cpumask_set_cpu(cpu, sched_domains_numa_masks[i][j]);
> +
> +				/*
> +				 * We skip updating numa_masks for offline
> +				 * nodes. However now that the node is
> +				 * finally online, CPUs that were added
> +				 * earlier, should now be accommodated into
> +				 * newly oneline node's numa mask.
> +				 */
> +				if (node != j && empty) {
> +					cpumask_or(sched_domains_numa_masks[i][node],
> +							sched_domains_numa_masks[i][node],
> +							sched_domains_numa_masks[0][j]);
> +				}
> +			}

Hmph, so we're playing games with masks of offline nodes - is that really
necessary? Your modification of sched_init_numa() still scans all of the
nodes (regardless of their online status) to build the distance map, and
that is never updated (sched_init_numa() is pretty much an __init
function).

So AFAICT this is all to cope with topology_span_sane() not applying
'cpu_map' to its masks. That seemed fine to me back when I wrote it, but in
light of having bogus distance values for offline nodes, not so much...

What about the below instead?

---
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index b77ad49dc14f..c2d9caad4aa6 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2075,6 +2075,7 @@ static struct sched_domain *build_sched_domain(struct sched_domain_topology_leve
 static bool topology_span_sane(struct sched_domain_topology_level *tl,
 			      const struct cpumask *cpu_map, int cpu)
 {
+	struct cpumask *intersect = sched_domains_tmpmask;
 	int i;
 
 	/* NUMA levels are allowed to overlap */
@@ -2090,14 +2091,17 @@ static bool topology_span_sane(struct sched_domain_topology_level *tl,
 	for_each_cpu(i, cpu_map) {
 		if (i == cpu)
 			continue;
+
 		/*
-		 * We should 'and' all those masks with 'cpu_map' to exactly
-		 * match the topology we're about to build, but that can only
-		 * remove CPUs, which only lessens our ability to detect
-		 * overlaps
+		 * We shouldn't have to bother with cpu_map here, unfortunately
+		 * some architectures (powerpc says hello) have to deal with
+		 * offline NUMA nodes reporting bogus distance values. This can
+		 * lead to funky NODE domain spans, but since those are offline
+		 * we can mask them out.
 		 */
+		cpumask_and(intersect, tl->mask(cpu), tl->mask(i));
 		if (!cpumask_equal(tl->mask(cpu), tl->mask(i)) &&
-		    cpumask_intersects(tl->mask(cpu), tl->mask(i)))
+		    cpumask_intersects(intersect, cpu_map))
 			return false;
 	}
 

WARNING: multiple messages have this Message-ID (diff)
From: Valentin Schneider <valentin.schneider@arm.com>
To: Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Michael Ellerman <mpe@ellerman.id.au>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Rik van Riel <riel@surriel.com>,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	linuxppc-dev@lists.ozlabs.org,
	Nathan Lynch <nathanl@linux.ibm.com>,
	Gautham R Shenoy <ego@linux.vnet.ibm.com>,
	Geetika Moolchandani <Geetika.Moolchandani1@ibm.com>,
	Laurent Dufour <ldufour@linux.ibm.com>
Subject: Re: [PATCH v2 1/2] sched/topology: Skip updating masks for non-online nodes
Date: Thu, 01 Jul 2021 15:28:45 +0100	[thread overview]
Message-ID: <875yxu85wi.mognet@arm.com> (raw)
In-Reply-To: <20210701041552.112072-2-srikar@linux.vnet.ibm.com>

On 01/07/21 09:45, Srikar Dronamraju wrote:
> @@ -1891,12 +1894,30 @@ void sched_init_numa(void)
>  void sched_domains_numa_masks_set(unsigned int cpu)
>  {
>       int node = cpu_to_node(cpu);
> -	int i, j;
> +	int i, j, empty;
>
> +	empty = cpumask_empty(sched_domains_numa_masks[0][node]);
>       for (i = 0; i < sched_domains_numa_levels; i++) {
>               for (j = 0; j < nr_node_ids; j++) {
> -			if (node_distance(j, node) <= sched_domains_numa_distance[i])
> +			if (!node_online(j))
> +				continue;
> +
> +			if (node_distance(j, node) <= sched_domains_numa_distance[i]) {
>                               cpumask_set_cpu(cpu, sched_domains_numa_masks[i][j]);
> +
> +				/*
> +				 * We skip updating numa_masks for offline
> +				 * nodes. However now that the node is
> +				 * finally online, CPUs that were added
> +				 * earlier, should now be accommodated into
> +				 * newly oneline node's numa mask.
> +				 */
> +				if (node != j && empty) {
> +					cpumask_or(sched_domains_numa_masks[i][node],
> +							sched_domains_numa_masks[i][node],
> +							sched_domains_numa_masks[0][j]);
> +				}
> +			}

Hmph, so we're playing games with masks of offline nodes - is that really
necessary? Your modification of sched_init_numa() still scans all of the
nodes (regardless of their online status) to build the distance map, and
that is never updated (sched_init_numa() is pretty much an __init
function).

So AFAICT this is all to cope with topology_span_sane() not applying
'cpu_map' to its masks. That seemed fine to me back when I wrote it, but in
light of having bogus distance values for offline nodes, not so much...

What about the below instead?

---
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index b77ad49dc14f..c2d9caad4aa6 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2075,6 +2075,7 @@ static struct sched_domain *build_sched_domain(struct sched_domain_topology_leve
 static bool topology_span_sane(struct sched_domain_topology_level *tl,
 			      const struct cpumask *cpu_map, int cpu)
 {
+	struct cpumask *intersect = sched_domains_tmpmask;
 	int i;
 
 	/* NUMA levels are allowed to overlap */
@@ -2090,14 +2091,17 @@ static bool topology_span_sane(struct sched_domain_topology_level *tl,
 	for_each_cpu(i, cpu_map) {
 		if (i == cpu)
 			continue;
+
 		/*
-		 * We should 'and' all those masks with 'cpu_map' to exactly
-		 * match the topology we're about to build, but that can only
-		 * remove CPUs, which only lessens our ability to detect
-		 * overlaps
+		 * We shouldn't have to bother with cpu_map here, unfortunately
+		 * some architectures (powerpc says hello) have to deal with
+		 * offline NUMA nodes reporting bogus distance values. This can
+		 * lead to funky NODE domain spans, but since those are offline
+		 * we can mask them out.
 		 */
+		cpumask_and(intersect, tl->mask(cpu), tl->mask(i));
 		if (!cpumask_equal(tl->mask(cpu), tl->mask(i)) &&
-		    cpumask_intersects(tl->mask(cpu), tl->mask(i)))
+		    cpumask_intersects(intersect, cpu_map))
 			return false;
 	}
 

  reply	other threads:[~2021-07-01 14:29 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-01  4:15 [PATCH v2 0/2] Skip numa distance for offline nodes Srikar Dronamraju
2021-07-01  4:15 ` Srikar Dronamraju
2021-07-01  4:15 ` [PATCH v2 1/2] sched/topology: Skip updating masks for non-online nodes Srikar Dronamraju
2021-07-01  4:15   ` Srikar Dronamraju
2021-07-01 14:28   ` Valentin Schneider [this message]
2021-07-01 14:28     ` Valentin Schneider
2021-07-12 12:48     ` Srikar Dronamraju
2021-07-12 12:48       ` Srikar Dronamraju
2021-07-13 16:32       ` Valentin Schneider
2021-07-13 16:32         ` Valentin Schneider
2021-07-23 14:39         ` Srikar Dronamraju
2021-07-23 14:39           ` Srikar Dronamraju
2021-08-04 10:01           ` Srikar Dronamraju
2021-08-04 10:01             ` Srikar Dronamraju
2021-08-04 10:20             ` Valentin Schneider
2021-08-04 10:20               ` Valentin Schneider
2021-08-08 15:56           ` Valentin Schneider
2021-08-08 15:56             ` Valentin Schneider
2021-08-09  6:52             ` Srikar Dronamraju
2021-08-09  6:52               ` Srikar Dronamraju
2021-08-09 12:52               ` Valentin Schneider
2021-08-09 12:52                 ` Valentin Schneider
2021-08-10 11:47                 ` Srikar Dronamraju
2021-08-10 11:47                   ` Srikar Dronamraju
2021-08-16 10:33                   ` Srikar Dronamraju
2021-08-16 10:33                     ` Srikar Dronamraju
2021-08-17  0:01                     ` Valentin Schneider
2021-08-17  0:01                       ` Valentin Schneider
2021-07-01  4:15 ` [PATCH v2 2/2] powerpc/numa: Fill distance_lookup_table for offline nodes Srikar Dronamraju
2021-07-01  4:15   ` Srikar Dronamraju
2021-07-01  9:36   ` kernel test robot
2021-07-01  9:36     ` kernel test robot
2021-07-01 10:20   ` kernel test robot
2021-07-01 10:20     ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=875yxu85wi.mognet@arm.com \
    --to=valentin.schneider@arm.com \
    --cc=Geetika.Moolchandani1@ibm.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=ego@linux.vnet.ibm.com \
    --cc=ldufour@linux.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mgorman@techsingularity.net \
    --cc=mingo@kernel.org \
    --cc=mpe@ellerman.id.au \
    --cc=nathanl@linux.ibm.com \
    --cc=peterz@infradead.org \
    --cc=riel@surriel.com \
    --cc=srikar@linux.vnet.ibm.com \
    --cc=tglx@linutronix.de \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.