* [PATCH] sched_ext: Use the resched_cpu() to replace resched_curr() in the bypass_lb_node()
@ 2025-12-22 11:53 Zqiang
2025-12-22 11:53 ` [PATCH] sched_ext: Avoid multiple irq_work_queue() calls in destroy_dsq() Zqiang
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Zqiang @ 2025-12-22 11:53 UTC (permalink / raw)
To: tj, void, arighi, changwoo; +Cc: sched-ext, linux-kernel, qiang.zhang
On PREEMPT_RT kernels, scx_bypass_lb_timerfn() runs in the preemptible
per-CPU ktimer kthread context, which means the following scenario can
occur (on x86):
cpu1                                  cpu2
                                      ktimer kthread:
                                      ->scx_bypass_lb_timerfn
                                        ->bypass_lb_node
                                          ->for_each_cpu(cpu, resched_mask)

migration/1:                          preempted by migration/2:
multi_cpu_stop()                      multi_cpu_stop()
->take_cpu_down()
  ->__cpu_disable()
    ->set cpu1 offline

                                      ->rq1 = cpu_rq(cpu1)
                                      ->resched_curr(rq1)
                                        ->smp_send_reschedule(cpu1)
                                          ->native_smp_send_reschedule(cpu1)
                                            ->if (unlikely(cpu_is_offline(cpu))) {
                                                  WARN(1, "sched: Unexpected
                                                       reschedule of offline CPU#%d!\n", cpu);
                                                  return;
                                              }
This commit therefore uses resched_cpu() instead of resched_curr()
in bypass_lb_node() to avoid sending IPIs to offline CPUs.
Signed-off-by: Zqiang <qiang.zhang@linux.dev>
---
kernel/sched/ext.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 5ebf8a740847..8f6d8d7f895c 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -3956,13 +3956,8 @@ static void bypass_lb_node(struct scx_sched *sch, int node)
nr_donor_target, nr_target);
}
- for_each_cpu(cpu, resched_mask) {
- struct rq *rq = cpu_rq(cpu);
-
- raw_spin_rq_lock_irq(rq);
- resched_curr(rq);
- raw_spin_rq_unlock_irq(rq);
- }
+ for_each_cpu(cpu, resched_mask)
+ resched_cpu(cpu);
for_each_cpu_and(cpu, cpu_online_mask, node_mask) {
u32 nr = READ_ONCE(cpu_rq(cpu)->scx.bypass_dsq.nr);
--
2.17.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
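[Editorial note] The difference the patch relies on can be illustrated with a small user-space sketch. This is not kernel code: the `model_*` and `send_reschedule()` names below are made up for illustration; the real resched_cpu() in kernel/sched/core.c re-checks that the target CPU is online under the rq lock before rescheduling, while calling resched_curr() directly on an offline CPU's rq reaches the IPI path and trips the WARN shown in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 4

/* Models cpu_online_mask: which CPUs are currently online. */
static bool cpu_online_mask[NR_CPUS] = { true, true, true, true };

/* Models native_smp_send_reschedule(): warns when the target is offline. */
static int send_reschedule(int cpu)
{
	if (!cpu_online_mask[cpu]) {
		fprintf(stderr,
			"sched: Unexpected reschedule of offline CPU#%d!\n", cpu);
		return -1;	/* the WARN() path */
	}
	return 0;		/* IPI sent */
}

/* Models the old loop body: resched_curr() kicks the CPU unconditionally. */
static int model_resched_curr(int cpu)
{
	return send_reschedule(cpu);
}

/* Models resched_cpu(): skips CPUs that went offline in the meantime. */
static int model_resched_cpu(int cpu)
{
	if (!cpu_online_mask[cpu])
		return 0;	/* silently skip, no IPI */
	return send_reschedule(cpu);
}
```

With CPU 1 hot-unplugged between building resched_mask and walking it, the old path hits the warning while the new path quietly skips the offline CPU.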
* [PATCH] sched_ext: Avoid multiple irq_work_queue() calls in destroy_dsq()
2025-12-22 11:53 [PATCH] sched_ext: Use the resched_cpu() to replace resched_curr() in the bypass_lb_node() Zqiang
@ 2025-12-22 11:53 ` Zqiang
2025-12-22 18:30 ` Andrea Righi
2025-12-23 4:00 ` Tejun Heo
2025-12-22 18:16 ` [PATCH] sched_ext: Use the resched_cpu() to replace resched_curr() in the bypass_lb_node() Andrea Righi
2025-12-23 4:00 ` Tejun Heo
2 siblings, 2 replies; 8+ messages in thread
From: Zqiang @ 2025-12-22 11:53 UTC (permalink / raw)
To: tj, void, arighi, changwoo; +Cc: sched-ext, linux-kernel, qiang.zhang
This commit makes irq_work_queue() be called only when llist_add()
returns true, i.e. when the node was added to a previously empty list.
Signed-off-by: Zqiang <qiang.zhang@linux.dev>
---
kernel/sched/ext.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 8f6d8d7f895c..136b01950a62 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -3439,8 +3439,8 @@ static void destroy_dsq(struct scx_sched *sch, u64 dsq_id)
* operations inside scheduler locks.
*/
dsq->id = SCX_DSQ_INVALID;
- llist_add(&dsq->free_node, &dsqs_to_free);
- irq_work_queue(&free_dsq_irq_work);
+ if (llist_add(&dsq->free_node, &dsqs_to_free))
+ irq_work_queue(&free_dsq_irq_work);
out_unlock_dsq:
raw_spin_unlock_irqrestore(&dsq->lock, flags);
--
2.17.1
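[Editorial note] The contract the patch relies on is that llist_add() returns true only when it pushed onto a previously empty list, so the irq_work needs to be queued only once per batch. A minimal single-threaded user-space sketch of that contract (the `model_*` names are hypothetical, and this is not the kernel's lock-free cmpxchg-based implementation):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
	struct node *next;
};

static struct node *list_head;	/* models the dsqs_to_free llist */
static int irq_work_queued;	/* counts irq_work_queue() calls */

/* Models llist_add(): push to the front, report whether list was empty. */
static bool model_llist_add(struct node *n)
{
	bool was_empty = (list_head == NULL);

	n->next = list_head;
	list_head = n;
	return was_empty;
}

/* Models the patched destroy_dsq() tail: queue work only on first add. */
static void model_destroy_dsq(struct node *n)
{
	if (model_llist_add(n))
		irq_work_queued++;
}
```

Destroying several DSQs before the irq_work drains the list queues the work once, instead of once per DSQ as before the patch.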
* Re: [PATCH] sched_ext: Use the resched_cpu() to replace resched_curr() in the bypass_lb_node()
2025-12-22 11:53 [PATCH] sched_ext: Use the resched_cpu() to replace resched_curr() in the bypass_lb_node() Zqiang
2025-12-22 11:53 ` [PATCH] sched_ext: Avoid multiple irq_work_queue() calls in destroy_dsq() Zqiang
@ 2025-12-22 18:16 ` Andrea Righi
2025-12-23 4:00 ` Tejun Heo
2 siblings, 0 replies; 8+ messages in thread
From: Andrea Righi @ 2025-12-22 18:16 UTC (permalink / raw)
To: Zqiang; +Cc: tj, void, changwoo, sched-ext, linux-kernel
On Mon, Dec 22, 2025 at 07:53:17PM +0800, Zqiang wrote:
> On PREEMPT_RT kernels, scx_bypass_lb_timerfn() runs in the preemptible
> per-CPU ktimer kthread context, which means the following scenario can
> occur (on x86):
>
> cpu1                                  cpu2
>                                       ktimer kthread:
>                                       ->scx_bypass_lb_timerfn
>                                         ->bypass_lb_node
>                                           ->for_each_cpu(cpu, resched_mask)
>
> migration/1:                          preempted by migration/2:
> multi_cpu_stop()                      multi_cpu_stop()
> ->take_cpu_down()
>   ->__cpu_disable()
>     ->set cpu1 offline
>
>                                       ->rq1 = cpu_rq(cpu1)
>                                       ->resched_curr(rq1)
>                                         ->smp_send_reschedule(cpu1)
>                                           ->native_smp_send_reschedule(cpu1)
>                                             ->if (unlikely(cpu_is_offline(cpu))) {
>                                                   WARN(1, "sched: Unexpected
>                                                        reschedule of offline CPU#%d!\n", cpu);
>                                                   return;
>                                               }
>
> This commit therefore uses resched_cpu() instead of resched_curr()
> in bypass_lb_node() to avoid sending IPIs to offline CPUs.
>
> Signed-off-by: Zqiang <qiang.zhang@linux.dev>
Good catch, resched_cpu() checks for online CPUs, so makes sense to me.
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Thanks,
-Andrea
> ---
> kernel/sched/ext.c | 9 ++-------
> 1 file changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 5ebf8a740847..8f6d8d7f895c 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -3956,13 +3956,8 @@ static void bypass_lb_node(struct scx_sched *sch, int node)
> nr_donor_target, nr_target);
> }
>
> - for_each_cpu(cpu, resched_mask) {
> - struct rq *rq = cpu_rq(cpu);
> -
> - raw_spin_rq_lock_irq(rq);
> - resched_curr(rq);
> - raw_spin_rq_unlock_irq(rq);
> - }
> + for_each_cpu(cpu, resched_mask)
> + resched_cpu(cpu);
>
> for_each_cpu_and(cpu, cpu_online_mask, node_mask) {
> u32 nr = READ_ONCE(cpu_rq(cpu)->scx.bypass_dsq.nr);
> --
> 2.17.1
>
* Re: [PATCH] sched_ext: Avoid multiple irq_work_queue() calls in destroy_dsq()
2025-12-22 11:53 ` [PATCH] sched_ext: Avoid multiple irq_work_queue() calls in destroy_dsq() Zqiang
@ 2025-12-22 18:30 ` Andrea Righi
2025-12-23 13:16 ` Zqiang
2025-12-23 4:00 ` Tejun Heo
1 sibling, 1 reply; 8+ messages in thread
From: Andrea Righi @ 2025-12-22 18:30 UTC (permalink / raw)
To: Zqiang; +Cc: tj, void, changwoo, sched-ext, linux-kernel
On Mon, Dec 22, 2025 at 07:53:18PM +0800, Zqiang wrote:
> This commit makes irq_work_queue() be called only when llist_add()
> returns true, i.e. when the node was added to a previously empty list.
Just to be more clear, we could rephrase the commit message as follows:
llist_add() returns true only when adding to an empty list, which indicates
that no IRQ work is currently queued or running. Therefore, we only need to
call irq_work_queue() when llist_add() returns true, to avoid unnecessarily
re-queueing IRQ work that is already pending or executing.
>
> Signed-off-by: Zqiang <qiang.zhang@linux.dev>
But overall, looks good to me.
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Thanks,
-Andrea
> ---
> kernel/sched/ext.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 8f6d8d7f895c..136b01950a62 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -3439,8 +3439,8 @@ static void destroy_dsq(struct scx_sched *sch, u64 dsq_id)
> * operations inside scheduler locks.
> */
> dsq->id = SCX_DSQ_INVALID;
> - llist_add(&dsq->free_node, &dsqs_to_free);
> - irq_work_queue(&free_dsq_irq_work);
> + if (llist_add(&dsq->free_node, &dsqs_to_free))
> + irq_work_queue(&free_dsq_irq_work);
>
> out_unlock_dsq:
> raw_spin_unlock_irqrestore(&dsq->lock, flags);
> --
> 2.17.1
>
* Re: [PATCH] sched_ext: Use the resched_cpu() to replace resched_curr() in the bypass_lb_node()
2025-12-22 11:53 [PATCH] sched_ext: Use the resched_cpu() to replace resched_curr() in the bypass_lb_node() Zqiang
2025-12-22 11:53 ` [PATCH] sched_ext: Avoid multiple irq_work_queue() calls in destroy_dsq() Zqiang
2025-12-22 18:16 ` [PATCH] sched_ext: Use the resched_cpu() to replace resched_curr() in the bypass_lb_node() Andrea Righi
@ 2025-12-23 4:00 ` Tejun Heo
2 siblings, 0 replies; 8+ messages in thread
From: Tejun Heo @ 2025-12-23 4:00 UTC (permalink / raw)
To: Zqiang; +Cc: void, arighi, changwoo, sched-ext, linux-kernel, emil
Applied to sched_ext/for-6.19-fixes.
Thanks.
--
tejun
* Re: [PATCH] sched_ext: Avoid multiple irq_work_queue() calls in destroy_dsq()
2025-12-22 11:53 ` [PATCH] sched_ext: Avoid multiple irq_work_queue() calls in destroy_dsq() Zqiang
2025-12-22 18:30 ` Andrea Righi
@ 2025-12-23 4:00 ` Tejun Heo
2025-12-23 13:18 ` Zqiang
1 sibling, 1 reply; 8+ messages in thread
From: Tejun Heo @ 2025-12-23 4:00 UTC (permalink / raw)
To: Zqiang; +Cc: void, arighi, changwoo, sched-ext, linux-kernel, emil
Applied to sched_ext/for-6.20.
Note: Commit message updated per Andrea's suggestion.
Thanks.
--
tejun
* Re: [PATCH] sched_ext: Avoid multiple irq_work_queue() calls in destroy_dsq()
2025-12-22 18:30 ` Andrea Righi
@ 2025-12-23 13:16 ` Zqiang
0 siblings, 0 replies; 8+ messages in thread
From: Zqiang @ 2025-12-23 13:16 UTC (permalink / raw)
To: Andrea Righi; +Cc: tj, void, changwoo, sched-ext, linux-kernel
>
> On Mon, Dec 22, 2025 at 07:53:18PM +0800, Zqiang wrote:
>
> >
> > This commit makes irq_work_queue() be called only when llist_add()
> > returns true, i.e. when the node was added to a previously empty list.
> >
> Just to be more clear, we could rephrase the commit message as follows:
>
> llist_add() returns true only when adding to an empty list, which indicates
> that no IRQ work is currently queued or running. Therefore, we only need to
> call irq_work_queue() when llist_add() returns true, to avoid unnecessarily
> re-queueing IRQ work that is already pending or executing.
Thank you for making the commit message clearer, and for the review :).
Thanks
Zqiang
>
> >
> > Signed-off-by: Zqiang <qiang.zhang@linux.dev>
> >
> But overall, looks good to me.
>
> Reviewed-by: Andrea Righi <arighi@nvidia.com>
>
> Thanks,
> -Andrea
>
> >
> > ---
> > kernel/sched/ext.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> > index 8f6d8d7f895c..136b01950a62 100644
> > --- a/kernel/sched/ext.c
> > +++ b/kernel/sched/ext.c
> > @@ -3439,8 +3439,8 @@ static void destroy_dsq(struct scx_sched *sch, u64 dsq_id)
> > * operations inside scheduler locks.
> > */
> > dsq->id = SCX_DSQ_INVALID;
> > - llist_add(&dsq->free_node, &dsqs_to_free);
> > - irq_work_queue(&free_dsq_irq_work);
> > + if (llist_add(&dsq->free_node, &dsqs_to_free))
> > + irq_work_queue(&free_dsq_irq_work);
> >
> > out_unlock_dsq:
> > raw_spin_unlock_irqrestore(&dsq->lock, flags);
> > --
> > 2.17.1
> >
>
* Re: [PATCH] sched_ext: Avoid multiple irq_work_queue() calls in destroy_dsq()
2025-12-23 4:00 ` Tejun Heo
@ 2025-12-23 13:18 ` Zqiang
0 siblings, 0 replies; 8+ messages in thread
From: Zqiang @ 2025-12-23 13:18 UTC (permalink / raw)
To: Tejun Heo; +Cc: void, arighi, changwoo, sched-ext, linux-kernel, emil
>
> Applied to sched_ext/for-6.20.
>
> Note: Commit message updated per Andrea's suggestion.
Thank you for updating the commit message.
Thanks
Zqiang
>
> Thanks.
> --
> tejun
>
2025-12-22 11:53 [PATCH] sched_ext: Use the resched_cpu() to replace resched_curr() in the bypass_lb_node() Zqiang
2025-12-22 11:53 ` [PATCH] sched_ext: Avoid multiple irq_work_queue() calls in destroy_dsq() Zqiang
2025-12-22 18:30 ` Andrea Righi
2025-12-23 13:16 ` Zqiang
2025-12-23 4:00 ` Tejun Heo
2025-12-23 13:18 ` Zqiang
2025-12-22 18:16 ` [PATCH] sched_ext: Use the resched_cpu() to replace resched_curr() in the bypass_lb_node() Andrea Righi
2025-12-23 4:00 ` Tejun Heo