public inbox for linux-kernel@vger.kernel.org
* [PATCH] sched/topology: Enable topology_span_sane check only for debug builds
@ 2024-10-22 17:57 Saurabh Sengar
  2024-10-23 16:39 ` Valentin Schneider
  0 siblings, 1 reply; 3+ messages in thread
From: Saurabh Sengar @ 2024-10-22 17:57 UTC
  To: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, mgorman, vschneid, linux-kernel
  Cc: stable, ssengar, srivatsa

On an x86 system under test with 1780 CPUs, topology_span_sane() takes
around 8 seconds cumulatively across all iterations. It is an expensive
operation that sanity-checks the non-NUMA topology masks.

CPU topology does not change very frequently, so make this check
optional for systems where the topology is trusted and a faster boot is
needed.

Restrict this to SCHED_DEBUG builds so that systems that want to avoid
this penalty can do so.

Fixes: ccf74128d66c ("sched/topology: Assert non-NUMA topology masks don't (partially) overlap")
Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
---
 kernel/sched/topology.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 9748a4c8d668..dacc8c6f978b 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2354,6 +2354,7 @@ static struct sched_domain *build_sched_domain(struct sched_domain_topology_leve
 	return sd;
 }
 
+#ifdef CONFIG_SCHED_DEBUG
 /*
  * Ensure topology masks are sane, i.e. there are no conflicts (overlaps) for
  * any two given CPUs at this (non-NUMA) topology level.
@@ -2387,6 +2388,7 @@ static bool topology_span_sane(struct sched_domain_topology_level *tl,
 
 	return true;
 }
+#endif
 
 /*
  * Build sched domains for a given set of CPUs and attach the sched domains
@@ -2417,8 +2419,10 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
 		sd = NULL;
 		for_each_sd_topology(tl) {
 
+#ifdef CONFIG_SCHED_DEBUG
 			if (WARN_ON(!topology_span_sane(tl, cpu_map, i)))
 				goto error;
+#endif
 
 			sd = build_sched_domain(tl, cpu_map, attr, sd, i);
 
-- 
2.43.0
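
To make the cost described in the commit message concrete, here is a
minimal user-space model of the check. It is a sketch, not the kernel
code: it assumes the check compares each CPU's mask against the mask of
every later CPU, and it replaces struct cpumask with one 64-bit word per
CPU. The names span_sane_for_cpu() and all_spans_sane() are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * User-space model only: one 64-bit word per CPU stands in for the
 * cpumask that a topology level reports for that CPU (so ncpus <= 64).
 */

/* Compare one CPU's span against every later CPU's span: O(ncpus). */
static bool span_sane_for_cpu(const uint64_t *span, int ncpus, int cpu)
{
	for (int i = cpu + 1; i < ncpus; i++) {
		bool equal = span[cpu] == span[i];
		bool intersects = (span[cpu] & span[i]) != 0;

		/* Spans must be identical or disjoint; partial overlap is a bug. */
		if (!equal && intersects)
			return false;
	}
	return true;
}

/* Called once per CPU, so the total is O(ncpus^2) mask comparisons. */
static bool all_spans_sane(const uint64_t *span, int ncpus)
{
	for (int cpu = 0; cpu < ncpus; cpu++) {
		if (!span_sane_for_cpu(span, ncpus, cpu))
			return false;
	}
	return true;
}
```

Under this model, 1780 CPUs mean roughly 1780 * 1779 / 2, or about 1.58
million, mask comparisons per topology level, and in the kernel each
comparison walks a bitmap whose size also grows with the CPU count.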



* Re: [PATCH] sched/topology: Enable topology_span_sane check only for debug builds
  2024-10-22 17:57 [PATCH] sched/topology: Enable topology_span_sane check only for debug builds Saurabh Sengar
@ 2024-10-23 16:39 ` Valentin Schneider
  2024-10-25  5:57   ` Saurabh Singh Sengar
  0 siblings, 1 reply; 3+ messages in thread
From: Valentin Schneider @ 2024-10-23 16:39 UTC
  To: Saurabh Sengar, mingo, peterz, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, linux-kernel
  Cc: stable, ssengar, srivatsa

On 22/10/24 10:57, Saurabh Sengar wrote:
> On a x86 system under test with 1780 CPUs, topology_span_sane() takes
> around 8 seconds cumulatively for all the iterations. It is an expensive
> operation which does the sanity of non-NUMA topology masks.
>
> CPU topology is not something which changes very frequently hence make
> this check optional for the systems where the topology is trusted and
> need faster bootup.
>
> Restrict this to SCHED_DEBUG builds so that this penalty can be avoided
> for the systems who wants to avoid it.
>
> Fixes: ccf74128d66c ("sched/topology: Assert non-NUMA topology masks don't (partially) overlap")
> Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>

Please see:
http://lore.kernel.org/r/20241010155111.230674-1-steve.wahl@hpe.com

Also note that most distros ship with CONFIG_SCHED_DEBUG=y, so while I'm
not 100% against it this would at the very least need to be gated behind
e.g. the sched_verbose cmdline argument to be useful.

But before that I'd like the "just run it once" option to be explored
first.
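
The difference between compile-time and run-time gating raised above can
be sketched in user space. This is only an illustration under stated
assumptions: verbose_requested is a hypothetical stand-in for whatever
flag the sched_verbose command line argument sets, and
expensive_sanity_check() and build_one_level() are made-up names, not
kernel functions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a flag set when sched_verbose is passed. */
static bool verbose_requested;

/* Counts how often the (modelled) expensive check actually runs. */
static int sanity_checks_run;

static bool expensive_sanity_check(void)
{
	sanity_checks_run++;
	return true;	/* pretend the topology is sane */
}

/* Returns false only if the check was requested and failed. */
static bool build_one_level(void)
{
	/* Short-circuit: the check is skipped unless it was asked for. */
	if (verbose_requested && !expensive_sanity_check())
		return false;

	/* ... the domain for this level would be built here ... */
	return true;
}
```

With a run-time flag, a kernel built with the debug option enabled only
pays for the check on boots that explicitly request it, which is the
case a pure #ifdef cannot cover.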



* Re: [PATCH] sched/topology: Enable topology_span_sane check only for debug builds
  2024-10-23 16:39 ` Valentin Schneider
@ 2024-10-25  5:57   ` Saurabh Singh Sengar
  0 siblings, 0 replies; 3+ messages in thread
From: Saurabh Singh Sengar @ 2024-10-25  5:57 UTC
  To: Valentin Schneider
  Cc: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, mgorman, linux-kernel, stable, ssengar,
	srivatsa

On Wed, Oct 23, 2024 at 06:39:37PM +0200, Valentin Schneider wrote:
> On 22/10/24 10:57, Saurabh Sengar wrote:
> > On a x86 system under test with 1780 CPUs, topology_span_sane() takes
> > around 8 seconds cumulatively for all the iterations. It is an expensive
> > operation which does the sanity of non-NUMA topology masks.
> >
> > CPU topology is not something which changes very frequently hence make
> > this check optional for the systems where the topology is trusted and
> > need faster bootup.
> >
> > Restrict this to SCHED_DEBUG builds so that this penalty can be avoided
> > for the systems who wants to avoid it.
> >
> > Fixes: ccf74128d66c ("sched/topology: Assert non-NUMA topology masks don't (partially) overlap")
> > Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
> 
> Please see:
> http://lore.kernel.org/r/20241010155111.230674-1-steve.wahl@hpe.com
> 
> Also note that most distros ship with CONFIG_SCHED_DEBUG=y, so while I'm
> not 100% against it this would at the very least need to be gated behind
> e.g. the sched_verbose cmdline argument to be useful.

Thanks for your review. I thought of using sched_verbose first, but I assumed
that many systems might not be using this command line option, and I didn't
want them to see a change in behaviour after my patch.

But if you think this is the right approach, I can send a v2.

> 
> But before that I'd like the "just run it once" option to be explored
> first.

That's a great improvement, but I understand there will still be a linear
penalty to pay for this sanity check. In my opinion, regardless of whether
these improvements are accepted or not, we should make this sanity check
optional.

- Saurabh
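
The "linear penalty" mentioned above can be illustrated with a
single-pass version of the check. This is a generic user-space sketch,
not the patch linked earlier in the thread. It assumes one 64-bit word
per CPU in place of struct cpumask and that every CPU appears in its own
span; lowest_cpu() and spans_sane_single_pass() are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Index of the lowest set bit; the caller guarantees mask != 0. */
static int lowest_cpu(uint64_t mask)
{
	int cpu = 0;

	while (!(mask & 1)) {
		mask >>= 1;
		cpu++;
	}
	return cpu;
}

/* One pass over the CPUs: O(ncpus) mask operations in total. */
static bool spans_sane_single_pass(const uint64_t *span, int ncpus)
{
	uint64_t covered = 0;	/* union of all spans accepted so far */
	uint64_t id_seen = 0;	/* lowest CPUs of the spans accepted so far */

	for (int cpu = 0; cpu < ncpus; cpu++) {
		int id;

		if (span[cpu] == 0)
			return false;	/* model assumption: no empty spans */

		id = lowest_cpu(span[cpu]);
		if (id >= ncpus)
			return false;	/* span names a CPU outside the model */

		if (id_seen & (UINT64_C(1) << id)) {
			/* Known span: it must match its representative exactly. */
			if (span[id] != span[cpu])
				return false;
		} else {
			/* New span: it must not touch anything already covered. */
			if (span[cpu] & covered)
				return false;
			covered |= span[cpu];
			id_seen |= UINT64_C(1) << id;
		}
	}
	return true;
}
```

Even in this form the work grows with the number of CPUs, since each
CPU's mask is still visited once per topology level.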

