Sched_ext development
* sched_ext and large cpu counts
From: Phil Auld @ 2025-10-07 13:35 UTC (permalink / raw)
  To: Andrea Righi; +Cc: Tejun Heo, David Vernet, Changwoo Min, sched-ext, pauld

Hi Andrea (and other sched_ext folks),

I've got some partners with systems with > 4096 cpus. On those systems
sched_ext crashes at boot due to:

init_sched_ext_class() {
...
        scx_kick_cpus_pnt_seqs =
                __alloc_percpu(sizeof(scx_kick_cpus_pnt_seqs[0]) * nr_cpu_ids,
                               __alignof__(scx_kick_cpus_pnt_seqs[0]));
        BUG_ON(!scx_kick_cpus_pnt_seqs);
...

4096 * 8 bytes is 32768, which is the max you can percpu allocate.  Anything
more and __alloc_percpu() fails and WARNs:

[    0.000000] illegal size (33792) or align (8) for percpu allocation
[    0.000000] WARNING: CPU: 0 PID: 0 at mm/percpu.c:1779 pcpu_alloc_noprof+0x715/0x820
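
The size arithmetic above can be checked with a small userspace sketch. It is not kernel code: `PERCPU_ALLOC_MAX` simply encodes the 32768-byte ceiling quoted in this message, and both helper names are hypothetical. It assumes 8-byte slots, as the message does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 32768 bytes is the percpu allocation ceiling quoted above. */
#define PERCPU_ALLOC_MAX ((size_t)32768)

/* Hypothetical helper: bytes requested per CPU for nr_cpu_ids 8-byte slots. */
static size_t pnt_seqs_request_size(size_t nr_cpu_ids)
{
	return sizeof(uint64_t) * nr_cpu_ids;
}

/* Hypothetical helper: true when the request fits under the ceiling. */
static bool percpu_request_fits(size_t size)
{
	return size > 0 && size <= PERCPU_ALLOC_MAX;
}
```

With 4096 CPUs the request is exactly 32768 bytes and still fits. The 33792 bytes reported in the warning correspond to 4224 slots, which does not.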

I started looking into changing that to a static array, which would have to be
sized by NR_CPUS (8192 in our case).  Because it's N^2, that starts to be a lot
of space.
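
To put a number on the N^2 growth, here is a rough userspace sketch of the footprint of a static N x N table of 8-byte slots. The helper names are hypothetical and the 8-byte slot size is the one assumed earlier in this message.

```c
#include <stdint.h>

/* Hypothetical helper: total bytes for an N x N table of 8-byte slots. */
static uint64_t static_table_bytes(uint64_t nr_cpus)
{
	return nr_cpus * nr_cpus * sizeof(uint64_t);
}

/* Hypothetical helper: convert a byte count to whole MiB. */
static uint64_t bytes_to_mib(uint64_t bytes)
{
	return bytes >> 20;
}
```

For NR_CPUS = 8192 this gives 536870912 bytes, i.e. 512 MiB. Halving the CPU count to 4096 cuts it to 128 MiB, which is the quadratic scaling in action.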

While looking at how it's used I had a different question.  The comment says

  * We busy-wait here to guarantee that no other task can
  * be scheduled on our core before the target CPU has
  * entered the resched path.

But pnt_seq is actually only updated if we enter the resched path AND switch
classes. That seems more restrictive than the comment seems to require, no?
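
The distinction can be illustrated with a small userspace analogy using C11 atomics. This is not the kernel implementation: `struct target_cpu`, `target_resched()`, and `kicker_wait()` are hypothetical names, and the bounded spin count exists only so the sketch terminates when run single-threaded. It models the reading described in this message, where the counter advances only if the target also switches classes.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU state the kicker watches. */
struct target_cpu {
	atomic_ulong pnt_seq;
};

/* Kicker side: record the counter value before kicking the target. */
static unsigned long kicker_snapshot(struct target_cpu *t)
{
	return atomic_load_explicit(&t->pnt_seq, memory_order_acquire);
}

/*
 * Target side: models entering the resched path. Per the reading above,
 * the counter is bumped only when the target also switches classes.
 */
static void target_resched(struct target_cpu *t, bool switches_class)
{
	if (switches_class)
		atomic_fetch_add_explicit(&t->pnt_seq, 1, memory_order_release);
}

/*
 * Kicker side: spin until the counter moves past the snapshot.
 * Bounded by max_spins (an assumption of this sketch) so it can give up.
 */
static bool kicker_wait(struct target_cpu *t, unsigned long snap,
			unsigned long max_spins)
{
	for (unsigned long i = 0; i < max_spins; i++) {
		if (atomic_load_explicit(&t->pnt_seq, memory_order_acquire) != snap)
			return true;
	}
	return false;
}
```

In this model a target that enters the resched path without switching classes never releases the waiter, even though the quoted comment only asks for the target to have entered the resched path.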

Any ideas on how to do this in a different way?

I'd rather not have to turn off the config option, but 512 MB (8192 * 8192 *
8 bytes) is a lot of space to allocate for this.

Thanks for taking a look and any suggestions.

Cheers,
Phil

Thread overview: 16+ messages
2025-10-07 13:35 sched_ext and large cpu counts Phil Auld
2025-10-08  2:37 ` Tejun Heo
2025-10-08  6:10   ` Andrea Righi
2025-10-08 20:53     ` Tejun Heo
2025-10-08 21:48       ` [PATCH v2] sched_ext: Allocate scx_kick_cpus_pnt_seqs lazily using kvzalloc() Tejun Heo
2025-10-08 22:24         ` Andrea Righi
2025-10-08 23:36           ` Tejun Heo
2025-10-08 23:38             ` Tejun Heo
2025-10-08 23:43             ` [PATCH v3] " Tejun Heo
2025-10-09  6:43               ` Andrea Righi
2025-10-09 12:06               ` Phil Auld
2025-10-10 13:02                 ` Phil Auld
2025-10-09 13:58               ` Emil Tsalapatis
2025-10-13 18:44               ` Tejun Heo
2025-10-13 20:13                 ` Andrea Righi
2025-10-08 11:23   ` sched_ext and large cpu counts Phil Auld
