* Re: [PATCH] sched_ext: Take NUMA node into account when allocating per-CPU cpumasks
2025-02-10 8:52 [PATCH] sched_ext: Take NUMA node into account when allocating per-CPU cpumasks lirongqing
@ 2025-02-10 9:02 ` Andrea Righi
2025-02-10 14:59 ` Changwoo Min
2025-02-10 17:21 ` Tejun Heo
2 siblings, 0 replies; 4+ messages in thread
From: Andrea Righi @ 2025-02-10 9:02 UTC
To: lirongqing
Cc: tj, void, changwoo, mingo, peterz, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, vschneid,
linux-kernel
On Mon, Feb 10, 2025 at 04:52:25PM +0800, lirongqing wrote:
> From: Li RongQing <lirongqing@baidu.com>
>
> per-CPU cpumasks are dominantly accessed from their own local CPUs,
> so allocate them node-local to improve performance.
Makes sense to me. Did you run any tests or benchmarks with this on a
large NUMA system?
Thanks,
-Andrea
>
> Signed-off-by: Li RongQing <lirongqing@baidu.com>
> ---
> kernel/sched/ext.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 8857c07..3fe5a2e 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -6325,15 +6325,16 @@ void __init init_sched_ext_class(void)
>
>  	for_each_possible_cpu(cpu) {
>  		struct rq *rq = cpu_rq(cpu);
> +		int n = cpu_to_node(cpu);
>
>  		init_dsq(&rq->scx.local_dsq, SCX_DSQ_LOCAL);
>  		INIT_LIST_HEAD(&rq->scx.runnable_list);
>  		INIT_LIST_HEAD(&rq->scx.ddsp_deferred_locals);
>
> -		BUG_ON(!zalloc_cpumask_var(&rq->scx.cpus_to_kick, GFP_KERNEL));
> -		BUG_ON(!zalloc_cpumask_var(&rq->scx.cpus_to_kick_if_idle, GFP_KERNEL));
> -		BUG_ON(!zalloc_cpumask_var(&rq->scx.cpus_to_preempt, GFP_KERNEL));
> -		BUG_ON(!zalloc_cpumask_var(&rq->scx.cpus_to_wait, GFP_KERNEL));
> +		BUG_ON(!zalloc_cpumask_var_node(&rq->scx.cpus_to_kick, GFP_KERNEL, n));
> +		BUG_ON(!zalloc_cpumask_var_node(&rq->scx.cpus_to_kick_if_idle, GFP_KERNEL, n));
> +		BUG_ON(!zalloc_cpumask_var_node(&rq->scx.cpus_to_preempt, GFP_KERNEL, n));
> +		BUG_ON(!zalloc_cpumask_var_node(&rq->scx.cpus_to_wait, GFP_KERNEL, n));
>  		init_irq_work(&rq->scx.deferred_irq_work, deferred_irq_workfn);
>  		init_irq_work(&rq->scx.kick_cpus_irq_work, kick_cpus_irq_workfn);
>
> --
> 2.9.4
>
* Re: [PATCH] sched_ext: Take NUMA node into account when allocating per-CPU cpumasks
From: Changwoo Min @ 2025-02-10 14:59 UTC
To: lirongqing, tj, void, arighi, mingo, peterz, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, linux-kernel
Hello,
On 2025-02-10 17:52, lirongqing wrote:
> From: Li RongQing <lirongqing@baidu.com>
>
> per-CPU cpumasks are dominantly accessed from their own local CPUs,
> so allocate them node-local to improve performance.
>
> Signed-off-by: Li RongQing <lirongqing@baidu.com>
> ---
> kernel/sched/ext.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 8857c07..3fe5a2e 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -6325,15 +6325,16 @@ void __init init_sched_ext_class(void)
>
>  	for_each_possible_cpu(cpu) {
>  		struct rq *rq = cpu_rq(cpu);
> +		int n = cpu_to_node(cpu);
>
>  		init_dsq(&rq->scx.local_dsq, SCX_DSQ_LOCAL);
>  		INIT_LIST_HEAD(&rq->scx.runnable_list);
>  		INIT_LIST_HEAD(&rq->scx.ddsp_deferred_locals);
>
> -		BUG_ON(!zalloc_cpumask_var(&rq->scx.cpus_to_kick, GFP_KERNEL));
> -		BUG_ON(!zalloc_cpumask_var(&rq->scx.cpus_to_kick_if_idle, GFP_KERNEL));
> -		BUG_ON(!zalloc_cpumask_var(&rq->scx.cpus_to_preempt, GFP_KERNEL));
> -		BUG_ON(!zalloc_cpumask_var(&rq->scx.cpus_to_wait, GFP_KERNEL));
> +		BUG_ON(!zalloc_cpumask_var_node(&rq->scx.cpus_to_kick, GFP_KERNEL, n));
> +		BUG_ON(!zalloc_cpumask_var_node(&rq->scx.cpus_to_kick_if_idle, GFP_KERNEL, n));
> +		BUG_ON(!zalloc_cpumask_var_node(&rq->scx.cpus_to_preempt, GFP_KERNEL, n));
> +		BUG_ON(!zalloc_cpumask_var_node(&rq->scx.cpus_to_wait, GFP_KERNEL, n));
>  		init_irq_work(&rq->scx.deferred_irq_work, deferred_irq_workfn);
>  		init_irq_work(&rq->scx.kick_cpus_irq_work, kick_cpus_irq_workfn);
>
The changes make sense to me. Thanks!
Acked-by: Changwoo Min <changwoo@igalia.com>
* Re: [PATCH] sched_ext: Take NUMA node into account when allocating per-CPU cpumasks
From: Tejun Heo @ 2025-02-10 17:21 UTC
To: lirongqing
Cc: void, arighi, changwoo, mingo, peterz, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, linux-kernel
On Mon, Feb 10, 2025 at 04:52:25PM +0800, lirongqing wrote:
> From: Li RongQing <lirongqing@baidu.com>
>
> per-CPU cpumasks are dominantly accessed from their own local CPUs,
> so allocate them node-local to improve performance.
>
> Signed-off-by: Li RongQing <lirongqing@baidu.com>
Applied to sched_ext/for-6.15.
Thanks.
--
tejun
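
Note on the pattern in this patch: the change looks up each CPU's NUMA
node with cpu_to_node() and passes it to zalloc_cpumask_var_node(), so
that each runqueue's masks are requested on the node of the CPU that
mostly touches them. In the kernel this only results in a real
allocation when CONFIG_CPUMASK_OFFSTACK is enabled. What follows is a
minimal userspace sketch of the same loop shape, not kernel code. Every
toy_* name, the 8-CPU/2-node topology, and the "record the node instead
of placing memory" allocator are assumptions made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_NR_CPUS  8	/* hypothetical machine size */
#define TOY_NR_NODES 2	/* hypothetical NUMA node count */

/* Hypothetical topology: CPUs 0-3 sit on node 0, CPUs 4-7 on node 1. */
int toy_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	return cpu / (TOY_NR_CPUS / TOY_NR_NODES);
}

/* Toy cpumask: one bit per CPU plus the node the memory was requested on. */
struct toy_cpumask {
	unsigned long bits;
	int node;
};

/*
 * Stand-in for zalloc_cpumask_var_node(). Returns nonzero on success.
 * The kernel would try to place the memory on @node; this sketch only
 * zero-allocates and records which node was asked for.
 */
int toy_zalloc_cpumask_node(struct toy_cpumask **maskp, int node)
{
	struct toy_cpumask *mask = calloc(1, sizeof(*mask));

	if (!mask)
		return 0;
	mask->node = node;
	*maskp = mask;
	return 1;
}

/* Toy runqueue holding a single per-CPU mask. */
struct toy_rq {
	struct toy_cpumask *cpus_to_kick;
};

struct toy_rq toy_rqs[TOY_NR_CPUS];

/* Same shape as the patched loop: find the CPU's node, allocate there. */
int toy_init_rqs(void)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		int n = toy_cpu_to_node(cpu);

		if (!toy_zalloc_cpumask_node(&toy_rqs[cpu].cpus_to_kick, n))
			return 0;
	}
	return 1;
}

/* Count CPUs whose mask was requested on that CPU's own node. */
int toy_count_node_local(void)
{
	int local = 0;

	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (toy_rqs[cpu].cpus_to_kick->node == toy_cpu_to_node(cpu))
			local++;
	return local;
}
```

Calling toy_init_rqs() and then toy_count_node_local() from a small
main() should report every mask as node-local under this toy topology.
It says nothing about actual performance, which would still need
measuring on a real NUMA machine.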