From: Leon Romanovsky <leon@kernel.org>
To: Steve Wahl <steve.wahl@hpe.com>
Cc: Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Valentin Schneider <vschneid@redhat.com>,
	linux-kernel@vger.kernel.org,
	K Prateek Nayak <kprateek.nayak@amd.com>,
	Vishal Chourasia <vishalc@linux.ibm.com>,
	samir <samir@linux.ibm.com>,
	Naman Jain <namjain@linux.microsoft.com>,
	Saurabh Singh Sengar <ssengar@linux.microsoft.com>,
	srivatsa@csail.mit.edu, Michael Kelley <mhklinux@outlook.com>,
	Russ Anderson <rja@hpe.com>, Dimitri Sivanich <sivanich@hpe.com>
Subject: Re: [PATCH v4 1/2] sched/topology: improve topology_span_sane speed
Date: Tue, 10 Jun 2025 14:07:01 +0300
Message-ID: <20250610110701.GA256154@unreal>
In-Reply-To: <20250304160844.75373-2-steve.wahl@hpe.com>

On Tue, Mar 04, 2025 at 10:08:43AM -0600, Steve Wahl wrote:
> Use a different approach to topology_span_sane(), that checks for the
> same constraint of no partial overlaps for any two CPU sets for
> non-NUMA topology levels, but does so in a way that is O(N) rather
> than O(N^2).
> 
> Instead of comparing with all other masks to detect collisions, keep
> one mask that includes all CPUs seen so far and detect collisions with
> a single cpumask_intersects test.
> 
> If the current mask has no collisions with previously seen masks, it
> should be a new mask, which can be uniquely identified by the lowest
> bit set in this mask.  Keep a pointer to this mask for future
> reference (in an array indexed by the lowest bit set), and add the
> CPUs in this mask to the list of those seen.
> 
> If the current mask does collide with previously seen masks, it should
> be exactly equal to a mask seen before, looked up in the same array
> indexed by the lowest bit set in the mask, a single comparison.
> 
> Move the topology_span_sane() check out of the existing topology level
> loop, let it use its own loop so that the array allocation can be done
> only once, shared across levels.
> 
> On a system with 1920 processors (16 sockets, 60 cores, 2 threads),
> the average time to take one processor offline is reduced from 2.18
> seconds to 1.01 seconds.  (Off-lining 959 of 1920 processors took
> 34m49.765s without this change, 16m10.038s with this change in place.)
> 
> Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
> ---
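
For readers following the description above, the single-pass check can be
sketched in userspace C roughly as follows. This is only an illustration
under stated assumptions, not the code from the patch: it models each
cpumask as a uint64_t (so at most 64 CPUs), and the names
span_sane_sketch(), lowest_bit() and MAX_CPUS are hypothetical. It also
assumes an empty mask should be rejected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CPUS 64	/* assumption: one uint64_t stands in for a cpumask */

/* Index of the lowest set bit; the caller guarantees mask != 0. */
static int lowest_bit(uint64_t mask)
{
	int bit = 0;

	while (!(mask & 1)) {
		mask >>= 1;
		bit++;
	}
	return bit;
}

/*
 * cpu_masks[cpu] is the span reported for that CPU at one topology level.
 * Returns true when no two spans partially overlap, i.e. any two spans
 * are either identical or disjoint.
 */
static bool span_sane_sketch(const uint64_t *cpu_masks, size_t nr_cpus)
{
	uint64_t covered = 0;		/* union of every span seen so far */
	uint64_t seen[MAX_CPUS] = {0};	/* span recorded per lowest-bit id, 0 = none */
	size_t cpu;

	assert(nr_cpus <= MAX_CPUS);

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		uint64_t mask = cpu_masks[cpu];
		int id;

		if (!mask)
			return false;	/* assumption: an empty span is not sane */

		id = lowest_bit(mask);

		if (!(mask & covered)) {
			/* No collision: a new span, remember it under its id. */
			seen[id] = mask;
			covered |= mask;
		} else if (seen[id] != mask) {
			/* Collision that is not an exact repeat: partial overlap. */
			return false;
		}
	}
	return true;
}
```

With four CPUs reporting spans 0x3, 0x3, 0xc, 0xc the sketch accepts the
layout, while 0x3, 0x6, 0xc, 0xc is rejected because 0x6 overlaps 0x3
without being equal to it. Each CPU costs one intersection test and at
most one comparison, which is where the O(N) behaviour comes from.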

<...>

>  
> +	if (WARN_ON(!topology_span_sane(cpu_map)))
> +		goto error;

Hi, 

This WARN_ON() generates the following splat in our regression runs on VMs.

 [    0.408379] ------------[ cut here ]------------
 [    0.409097] WARNING: CPU: 0 PID: 1 at kernel/sched/topology.c:2486 build_sched_domains+0xe67/0x13a0
 [    0.410797] Modules linked in:
 [    0.411453] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.16.0-rc1_for_upstream_min_debug_2025_06_09_14_44 #1 NONE 
 [    0.413353] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 [    0.415440] RIP: 0010:build_sched_domains+0xe67/0x13a0
 [    0.416458] Code: ff ff 8b 6c 24 08 48 8b 44 24 68 65 48 2b 05 60 24 d0 01 0f 85 03 05 00 00 48 83 c4 70 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 <0f> 0b e9 65 fe ff ff 48 c7 c7 28 fb 08 82 4c 89 44 24 28 c6 05 e4
 [    0.417662] RSP: 0000:ffff8881002efe30 EFLAGS: 00010202
 [    0.418686] RAX: 00000000ffffff01 RBX: 0000000000000002 RCX: 00000000ffffff01
 [    0.419982] RDX: 00000000fffffff6 RSI: 0000000000000300 RDI: ffff888100047168
 [    0.421166] RBP: 0000000000000000 R08: ffff888100047168 R09: 0000000000000000
 [    0.422514] R10: ffffffff830dee80 R11: 0000000000000000 R12: ffff888100047168
 [    0.423820] R13: 0000000000000002 R14: ffff888100193480 R15: ffff888380030f40
 [    0.425164] FS:  0000000000000000(0000) GS:ffff8881b9b76000(0000) knlGS:0000000000000000
 [    0.426751] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 [    0.427832] CR2: ffff88843ffff000 CR3: 000000000282c001 CR4: 0000000000370eb0
 [    0.428818] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 [    0.430131] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 [    0.431429] Call Trace:
 [    0.431983]  <TASK>
 [    0.432500]  sched_init_smp+0x32/0xa0
 [    0.433069]  ? stop_machine+0x2c/0x40
 [    0.433821]  kernel_init_freeable+0xf5/0x260
 [    0.434682]  ? rest_init+0xc0/0xc0
 [    0.435399]  kernel_init+0x16/0x120
 [    0.436140]  ret_from_fork+0x5e/0xd0
 [    0.436817]  ? rest_init+0xc0/0xc0
 [    0.437526]  ret_from_fork_asm+0x11/0x20
 [    0.438335]  </TASK>
 [    0.438841] ---[ end trace 0000000000000000 ]---

Thanks

> +
>  	/* Build the groups for the domains */
>  	for_each_cpu(i, cpu_map) {
>  		for (sd = *per_cpu_ptr(d.sd, i); sd; sd = sd->parent) {
> -- 
> 2.26.2
> 

Thread overview: 25+ messages
2025-03-04 16:08 [PATCH v4 0/2] Improving topology_span_sane Steve Wahl
2025-03-04 16:08 ` [PATCH v4 1/2] sched/topology: improve topology_span_sane speed Steve Wahl
2025-04-08 19:05   ` [tip: sched/core] " tip-bot2 for Steve Wahl
2025-06-10 11:07   ` Leon Romanovsky [this message]
2025-06-10 11:33     ` [PATCH v4 1/2] " K Prateek Nayak
2025-06-10 12:36       ` Leon Romanovsky
2025-06-10 13:09         ` Leon Romanovsky
2025-06-10 19:39           ` Steve Wahl
2025-06-11  6:06             ` Leon Romanovsky
2025-06-11  6:56               ` K Prateek Nayak
2025-06-12  7:41                 ` Leon Romanovsky
2025-06-12  9:30                   ` K Prateek Nayak
2025-06-12 10:41                     ` K Prateek Nayak
2025-06-15  6:42                       ` Leon Romanovsky
2025-06-16 14:18                         ` Steve Wahl
2025-06-17  3:04                           ` K Prateek Nayak
2025-06-17  7:55                             ` Leon Romanovsky
2025-06-17  7:34                           ` Leon Romanovsky
2025-06-17  9:22                             ` K Prateek Nayak
2025-06-23  6:06                               ` K Prateek Nayak
2025-03-04 16:08 ` [PATCH v4 2/2] sched/topology: Refinement to topology_span_sane speedup Steve Wahl
2025-04-08 19:05   ` [tip: sched/core] " tip-bot2 for Steve Wahl
2025-03-06  6:46 ` [PATCH v4 0/2] Improving topology_span_sane K Prateek Nayak
2025-03-06 14:33 ` Valentin Schneider
2025-03-07 10:06 ` Madadi Vineeth Reddy
