[PATCH v5 0/2] sched/fair: Optimize some active balance logic

* [PATCH v5 0/2] sched/fair: Optimize some active balance logic
@ 2026-06-17  7:21 Xin Zhao
  2026-06-17  7:21 ` [PATCH v5 1/2] sched/fair: Don't trigger active lb if src_rq->curr is not on_rq Xin Zhao
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Xin Zhao @ 2026-06-17  7:21 UTC (permalink / raw)
  To: vschneid, mingo, peterz, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, kprateek.nayak,
	pauld, aiqun.yu
  Cc: linux-kernel, Xin Zhao

Active balancing relies on the migration threads, which interrupt the task
running on src_rq, so it has a measurable impact on overall performance.
Active balancing also fails often. There is already a check that the
current task ('curr') on src_rq can run on dst_rq, but we have observed
that even with that check, if curr is a CFS task and its on_rq is 0, the
failure rate of active balancing is very high. Below are test data from a
fillback task scenario run for 300 seconds on a platform with 18 CPUs:

fair: busiest->curr->sched_class == &fair_sched_class
on_rq: busiest->curr->on_rq
total: number of active balances triggered for the corresponding type
fail: failed to migrate any task in active_load_balance_cpu_stop()

                 fair && !on_rq   !fair && !on_rq
       domain       total    fail   total    fail
cpu0   0x00003          0       0       0       0
cpu0   0x3ffff         33      33       1       1
cpu1   0x00003          0       0       0       0
cpu1   0x3ffff         42      42       0       0
cpu2   0x0003c          4       4       0       0
cpu2   0x3ffff         12      12       0       0
cpu3   0x0003c          3       3       0       0
cpu3   0x3ffff          8       7       0       0
cpu4   0x0003c          2       2       0       0
cpu4   0x3ffff          5       4       0       0
cpu5   0x0003c          4       4       0       0
cpu5   0x3ffff          8       8       0       0
cpu6   0x003c0         60      60       0       0
cpu6   0x3ffff         28      27       0       0
cpu7   0x003c0        194     184       0       0
cpu7   0x3ffff         35      35       1       1
cpu8   0x003c0        240     228       0       0
cpu8   0x3ffff         28      28       0       0
cpu9   0x003c0          0       0       0       0
cpu9   0x3ffff         10      10       0       0
cpu10  0x03c00         52      50       0       0
cpu10  0x3ffff          0       0       0       0
cpu11  0x03c00         70      68       0       0
cpu11  0x3ffff          1       1       0       0
cpu12  0x03c00         73      72       0       0
cpu12  0x3ffff          0       0       0       0
cpu13  0x03c00         79      76       0       0
cpu13  0x3ffff          0       0       0       0
cpu14  0x3c000          0       0       0       0
cpu14  0x3ffff         57      55       1       0
cpu15  0x3c000         53      52       1       0
cpu15  0x3ffff         30      29       0       0
cpu16  0x3c000        344     341      10       6
cpu16  0x3ffff        103     100       2       1
cpu17  0x3c000        183     179       2       2
cpu17  0x3ffff         78      77       0       0
sum                  1839    1791      18      11

In __schedule(), sched_balance_rq() is called during pick_next_task(),
before rq->curr is set to next. It unlocks and then re-locks the rq,
creating "holes" during which other CPUs may see rq->curr->on_rq as zero:
try_to_block_task() has already set curr->on_rq to 0, but during the rq
lock "hole" in pick_next_task() rq->curr has not yet been switched to
next.

We do not need to perform active balancing when src_rq->curr is a CFS task
but its on_rq is 0, as the other CFS tasks have probably been checked just
before. For cases where src_rq->curr is a non-CFS task, we keep the
affinity check against dst_rq as the trigger for active balancing, because
such a task is likely to wake, or be woken by, a CFS task on src_rq that
has similar affinity and can be migrated. Also, the rq lock is released
after detach_tasks(). Tasks woken on the rq during detach_tasks() may
preempt the previous CFS task. Based on my tests (not shown above), the
success rate of active balancing under !fair && on_rq is 98.4%. This
scenario does not require stop work, but it would need another path to
detach and attach the task(s). That does not seem worth adding; Valentin
and Vincent have already discussed it, see [1].

Additionally, the sched_class field is a bit far from on_cpu in
task_struct. The preceding traversal of cfs_tasks already checks on_cpu in
can_migrate_task(), so thanks to cache locality the additional on_rq check
costs few CPU cycles.

There are two reasons for not doing the sched_class and on_rq checks on
busiest->curr together with the cpumask_test_cpu() check:
1. The patch should not introduce new cases that skip the logic resetting
balance_interval to min_interval.
2. The check of whether active balance has just been triggered on the
busiest CPU filters out slightly more cases than the sched_class and on_rq
checks.

Additionally, sched_balance_rq() now resets balance_interval to
min_interval unconditionally. The only difference is that the original
logic does not reset balance_interval when the dst_cpu softirq handler is
preempted between the two need_active_balance() checks while src_cpu
successfully runs the just-dispatched active balance. We have not observed
any substantial benefit from reducing the balance opportunities under such
fluctuating conditions, so simplify the need_active_balance() check logic.

[1]: https://lore.kernel.org/lkml/20190815145107.5318-5-valentin.schneider@arm.com/

---
Change in v5:
- Get rid of the 'busiest->curr->sched_class == &fair_sched_class' check,
  as suggested by Valentin Schneider.
- Re-test the new condition, adjust and enrich the related commit log.

Change in v4:
- Add comment to explain why need to check busiest->curr->on_rq,
  as suggested by Valentin Schneider.
- Restructure the PATCH code, add one more label, make the code more
  comfortable to read,
  as suggested by Valentin Schneider.
- Link to v4: https://lore.kernel.org/all/20260616071859.343253-1-jackzxcui1989@163.com/

Change in v3:
- Consider the cost by sched_class and on_rq check,
  as suggested by Aiqun(Maira) Yu.
  Move the check after the check of whether busiest cpu has been just
  triggered active balance.
- Separate the revise of balance_interval reset part to an independent
  patch, as suggested by Aiqun(Maira) Yu.
  Add more details about the independent patch.
- Link to v3: https://lore.kernel.org/all/20260615053809.3587677-1-jackzxcui1989@163.com/

Change in v2:
- Add reason in the commit log why we can see zero rq->curr->on_rq when we
  hold rq lock,
  as suggested by Valentin Schneider.
- Link to v2: https://lore.kernel.org/all/20260613073228.1951105-1-jackzxcui1989@163.com/

v1:
- Link to v1: https://lore.kernel.org/all/20260603125938.1938115-1-jackzxcui1989@163.com/

Xin Zhao (2):
  sched/fair: Don't trigger active lb if src_rq->curr is not on_rq
  sched/fair: Simplify balance_interval reset logic in
    sched_balance_rq()

 kernel/sched/fair.c | 25 ++++++++++++++++---------
 1 file changed, 16 insertions(+), 9 deletions(-)

-- 
2.34.1


* [PATCH v5 1/2] sched/fair: Don't trigger active lb if src_rq->curr is not on_rq
  2026-06-17  7:21 [PATCH v5 0/2] sched/fair: Optimize some active balance logic Xin Zhao
@ 2026-06-17  7:21 ` Xin Zhao
  2026-06-17  9:30   ` Valentin Schneider
  2026-06-18  9:18   ` Peter Zijlstra
  2026-06-17  7:21 ` [PATCH v5 2/2] sched/fair: Simplify balance_interval reset logic in sched_balance_rq() Xin Zhao
  2026-06-18 10:56 ` [PATCH v5 0/2] sched/fair: Optimize some active balance logic Peter Zijlstra
  2 siblings, 2 replies; 12+ messages in thread
From: Xin Zhao @ 2026-06-17  7:21 UTC (permalink / raw)
  To: vschneid, mingo, peterz, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, kprateek.nayak,
	pauld, aiqun.yu
  Cc: linux-kernel, Xin Zhao

Active balancing relies on the migration threads, which interrupt the task
running on src_rq, so it has a measurable impact on overall performance.
Active balancing also fails often. There is already a check that the
current task ('curr') on src_rq can run on dst_rq, but we have observed
that even with that check, if curr is a CFS task and its on_rq is 0, the
failure rate of active balancing is very high. Below are test data from a
fillback task scenario run for 300 seconds on a platform with 18 CPUs:

fair: busiest->curr->sched_class == &fair_sched_class
on_rq: busiest->curr->on_rq
total: number of active balances triggered for the corresponding type
fail: failed to migrate any task in active_load_balance_cpu_stop()

                 fair && !on_rq   !fair && !on_rq
       domain       total    fail   total    fail
cpu0   0x00003          0       0       0       0
cpu0   0x3ffff         33      33       1       1
cpu1   0x00003          0       0       0       0
cpu1   0x3ffff         42      42       0       0
cpu2   0x0003c          4       4       0       0
cpu2   0x3ffff         12      12       0       0
cpu3   0x0003c          3       3       0       0
cpu3   0x3ffff          8       7       0       0
cpu4   0x0003c          2       2       0       0
cpu4   0x3ffff          5       4       0       0
cpu5   0x0003c          4       4       0       0
cpu5   0x3ffff          8       8       0       0
cpu6   0x003c0         60      60       0       0
cpu6   0x3ffff         28      27       0       0
cpu7   0x003c0        194     184       0       0
cpu7   0x3ffff         35      35       1       1
cpu8   0x003c0        240     228       0       0
cpu8   0x3ffff         28      28       0       0
cpu9   0x003c0          0       0       0       0
cpu9   0x3ffff         10      10       0       0
cpu10  0x03c00         52      50       0       0
cpu10  0x3ffff          0       0       0       0
cpu11  0x03c00         70      68       0       0
cpu11  0x3ffff          1       1       0       0
cpu12  0x03c00         73      72       0       0
cpu12  0x3ffff          0       0       0       0
cpu13  0x03c00         79      76       0       0
cpu13  0x3ffff          0       0       0       0
cpu14  0x3c000          0       0       0       0
cpu14  0x3ffff         57      55       1       0
cpu15  0x3c000         53      52       1       0
cpu15  0x3ffff         30      29       0       0
cpu16  0x3c000        344     341      10       6
cpu16  0x3ffff        103     100       2       1
cpu17  0x3c000        183     179       2       2
cpu17  0x3ffff         78      77       0       0
sum                  1839    1791      18      11

In __schedule(), sched_balance_rq() is called during pick_next_task(),
before rq->curr is set to next. It unlocks and then re-locks the rq,
creating "holes" during which other CPUs may see rq->curr->on_rq as zero:
try_to_block_task() has already set curr->on_rq to 0, but during the rq
lock "hole" in pick_next_task() rq->curr has not yet been switched to
next.

We do not need to perform active balancing when src_rq->curr is a CFS task
but its on_rq is 0, as the other CFS tasks have probably been checked just
before. For cases where src_rq->curr is a non-CFS task, we keep the
affinity check against dst_rq as the trigger for active balancing, because
such a task is likely to wake, or be woken by, a CFS task on src_rq that
has similar affinity and can be migrated. Also, the rq lock is released
after detach_tasks(). Tasks woken on the rq during detach_tasks() may
preempt the previous CFS task. Based on my tests (not shown above), the
success rate of active balancing under !fair && on_rq is 98.4%. This
scenario does not require stop work, but it would need another path to
detach and attach the task(s). That does not seem worth adding; Valentin
and Vincent have already discussed it, see [1].

Additionally, the sched_class field is a bit far from on_cpu in
task_struct. The preceding traversal of cfs_tasks already checks on_cpu in
can_migrate_task(), so thanks to cache locality the additional on_rq check
costs few CPU cycles.

There are two reasons for not doing the sched_class and on_rq checks on
busiest->curr together with the cpumask_test_cpu() check:
1. The patch should not introduce new cases that skip the logic resetting
balance_interval to min_interval.
2. The check of whether active balance has just been triggered on the
busiest CPU filters out slightly more cases than the sched_class and on_rq
checks.

[1]: https://lore.kernel.org/lkml/20190815145107.5318-5-valentin.schneider@arm.com/

Signed-off-by: Xin Zhao <jackzxcui1989@163.com>
---

Change in v5:
- Get rid of the 'busiest->curr->sched_class == &fair_sched_class' check,
  as suggested by Valentin Schneider.
- Re-test the new condition, adjust and enrich the related commit log.

Change in v4:
- Add comment to explain why need to check busiest->curr->on_rq,
  as suggested by Valentin Schneider.
- Restructure the PATCH code, add one more label, make the code more
  comfortable to read,
  as suggested by Valentin Schneider.
- Link to v4: https://lore.kernel.org/all/20260616071859.343253-2-jackzxcui1989@163.com/

Change in v3:
- Consider the cost by sched_class and on_rq check,
  as suggested by Aiqun(Maira) Yu.
  Move the check after the check of whether busiest cpu has been just
  triggered active balance.
- Link to v3: https://lore.kernel.org/all/20260615053809.3587677-2-jackzxcui1989@163.com/

Change in v2:
- Add reason in the commit log why we can see zero rq->curr->on_rq when we
  hold rq lock,
  as suggested by Valentin Schneider.
- Link to v2: https://lore.kernel.org/all/20260613073228.1951105-1-jackzxcui1989@163.com/

v1:
- Link to v1: https://lore.kernel.org/all/20260603125938.1938115-1-jackzxcui1989@163.com/
---
 kernel/sched/fair.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b5819c489..2b9653623 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -13436,12 +13436,21 @@ static int sched_balance_rq(int this_cpu, struct rq *this_rq,
 			 * ->active_balance_work.  Once set, it's cleared
 			 * only after active load balance is finished.
 			 */
-			if (!busiest->active_balance) {
-				busiest->active_balance = 1;
-				busiest->push_cpu = this_cpu;
-				active_balance = 1;
-			}
+			if (busiest->active_balance)
+				goto no_active_balance;
+
+			/*
+			 * @busiest dropped its rq_lock in the middle of
+			 * scheduling out its ->curr task (->on_rq := 0), no
+			 * need to forcefully punt it away with active balance.
+			 */
+			if (!busiest->curr->on_rq)
+				goto no_active_balance;

+			busiest->active_balance = 1;
+			busiest->push_cpu = this_cpu;
+			active_balance = 1;
+no_active_balance:
 			preempt_disable();
 			raw_spin_rq_unlock_irqrestore(busiest, flags);
 			if (active_balance) {
-- 
2.34.1


* [PATCH v5 2/2] sched/fair: Simplify balance_interval reset logic in sched_balance_rq()
  2026-06-17  7:21 [PATCH v5 0/2] sched/fair: Optimize some active balance logic Xin Zhao
  2026-06-17  7:21 ` [PATCH v5 1/2] sched/fair: Don't trigger active lb if src_rq->curr is not on_rq Xin Zhao
@ 2026-06-17  7:21 ` Xin Zhao
  2026-06-18  9:40   ` Peter Zijlstra
  2026-06-18 10:56 ` [PATCH v5 0/2] sched/fair: Optimize some active balance logic Peter Zijlstra
  2 siblings, 1 reply; 12+ messages in thread
From: Xin Zhao @ 2026-06-17  7:21 UTC (permalink / raw)
  To: vschneid, mingo, peterz, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, kprateek.nayak,
	pauld, aiqun.yu
  Cc: linux-kernel, Xin Zhao

In sched_balance_rq(), need_active_balance() may be called twice in quick
succession, which is unnecessary. Two conditions in sched_balance_rq()
reset balance_interval to min_interval: the local variable active_balance
being 0, and need_active_balance() returning a non-zero value. The local
variable active_balance is initialized to 0. Therefore, the only case in
which balance_interval is NOT reset to min_interval is when the first
need_active_balance() call has led to active_balance being set to 1 and
the second call then returns 0. In other words, the busiest rq must
complete the just-dispatched active balance stop work in the interval
between the two closely spaced need_active_balance() calls, which is quite
rare.

There are two main paths into sched_balance_rq(): the newly idle balance
triggered from __schedule(), and the periodic balance controlled by
sd->balance_interval or nohz.next_balance, which ultimately runs in
softirq context. The vast majority of sched_balance_rq() calls come from
the first path. Preemption is disabled during __schedule(), so the
interval between the two need_active_balance() checks cannot be long.
Thus balance_interval may be left un-reset only on the second path, and
even there it is unlikely: in softirq context, other tasks can preempt
execution between the two need_active_balance() checks and lengthen the
interval between them. However, there is no evidence that skipping the
balance_interval reset in these low-probability, preemption-induced cases
offers any significant benefit. It is better to simplify this complex
reset logic for balance_interval into an unconditional reset.

Signed-off-by: Xin Zhao <jackzxcui1989@163.com>
---
 kernel/sched/fair.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2b9653623..9c78241e9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -13464,10 +13464,8 @@ static int sched_balance_rq(int this_cpu, struct rq *this_rq,
 		sd->nr_balance_failed = 0;
 	}

-	if (likely(!active_balance) || need_active_balance(&env)) {
-		/* We were unbalanced, so reset the balancing interval */
-		sd->balance_interval = sd->min_interval;
-	}
+	/* We were unbalanced, so reset the balancing interval */
+	sd->balance_interval = sd->min_interval;

 	goto out;

-- 
2.34.1


* Re: [PATCH v5 1/2] sched/fair: Don't trigger active lb if src_rq->curr is not on_rq
  2026-06-17  7:21 ` [PATCH v5 1/2] sched/fair: Don't trigger active lb if src_rq->curr is not on_rq Xin Zhao
@ 2026-06-17  9:30   ` Valentin Schneider
  2026-06-18  9:18   ` Peter Zijlstra
  1 sibling, 0 replies; 12+ messages in thread
From: Valentin Schneider @ 2026-06-17  9:30 UTC (permalink / raw)
  To: Xin Zhao, mingo, peterz, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, kprateek.nayak,
	pauld, aiqun.yu
  Cc: linux-kernel, Xin Zhao

On 17/06/26 15:21, Xin Zhao wrote:
> Active balancing needs the help by migration threads which will interrupt
> task on src_rq. It has a certain impact on overall performance. Active
> balancing often fails, there is a check to determine whether the current
> task(say it 'curr') on src_rq can run on dst_rq. We have observed that
> even that, if curr is a CFS task and on_rq is 0, the failure rate of
> active balancing is very high. Below are the test data from a certain
> fillback task scenario executed on a platform with 18 CPUs over 300
> seconds:
>
> fair: busiest->curr->sched_class == &fair_sched_class
> on_rq: busiest->curr->on_rq
> total: active balance count triggered of correspondent type
> fail: fail to migrate one task in active_load_balance_cpu_stop()
>
>                  fair && !on_rq   !fair && !on_rq
>        domain       total    fail   total    fail
> cpu0   0x00003          0       0       0       0
> cpu0   0x3ffff         33      33       1       1
> cpu1   0x00003          0       0       0       0
> cpu1   0x3ffff         42      42       0       0
> cpu2   0x0003c          4       4       0       0
> cpu2   0x3ffff         12      12       0       0
> cpu3   0x0003c          3       3       0       0
> cpu3   0x3ffff          8       7       0       0
> cpu4   0x0003c          2       2       0       0
> cpu4   0x3ffff          5       4       0       0
> cpu5   0x0003c          4       4       0       0
> cpu5   0x3ffff          8       8       0       0
> cpu6   0x003c0         60      60       0       0
> cpu6   0x3ffff         28      27       0       0
> cpu7   0x003c0        194     184       0       0
> cpu7   0x3ffff         35      35       1       1
> cpu8   0x003c0        240     228       0       0
> cpu8   0x3ffff         28      28       0       0
> cpu9   0x003c0          0       0       0       0
> cpu9   0x3ffff         10      10       0       0
> cpu10  0x03c00         52      50       0       0
> cpu10  0x3ffff          0       0       0       0
> cpu11  0x03c00         70      68       0       0
> cpu11  0x3ffff          1       1       0       0
> cpu12  0x03c00         73      72       0       0
> cpu12  0x3ffff          0       0       0       0
> cpu13  0x03c00         79      76       0       0
> cpu13  0x3ffff          0       0       0       0
> cpu14  0x3c000          0       0       0       0
> cpu14  0x3ffff         57      55       1       0
> cpu15  0x3c000         53      52       1       0
> cpu15  0x3ffff         30      29       0       0
> cpu16  0x3c000        344     341      10       6
> cpu16  0x3ffff        103     100       2       1
> cpu17  0x3c000        183     179       2       2
> cpu17  0x3ffff         78      77       0       0
> sum                  1839    1791      18      11
>
> In __schedule(), before setting curr to next, during the execution of
> pick_next_task(), sched_balance_rq() is called. It will unlock and then
> re-lock the rq, creating "holes" during which other CPUs may see zero
> rq->curr->on_rq. try_to_block_task() sets curr->on_rq to 0, and during the
> rq lock "hole" in pick_next_task(), rq->curr has not yet been assigned to
> next, resulting in curr->on_rq being seen as 0.
>
> We do not need to perform active balancing when src_rq->curr is CFS task
> but on_rq is 0, as other CFS tasks have been probably checked just before.
> For cases where src_rq->curr is a non-CFS task, we retain the affinity
> check for dst_rq to trigger active balancing because such task is likely
> to wake-up or woken-by src_rq CFS task which has similar affinity
> characteristics to migrate. Also, after executing detach_tasks(), rq lock
> is released. Tasks on the rq awakened during detach_tasks() may preempt
> the previous CFS task. Based on my test(though not shown above), success
> rate of active balancing under the condition of !fair && on_rq is 98.4%.
> This scenario does not require the use of stop work, but need to add
> another path to detach attach task(s). It seems not necessary enough to
> add it, Valentin and Vincent have already discussed about it, see [1].
>
> Additionally, sched_class field is a bit far from on_cpu in task_struct.
> The previous traversal of cfs_tasks checks on_cpu in can_migrate_task(),
> so the additional check for on_rq will not incur much cpu cycle loss, due
> to cache locality.
>
> Two reasons why not check sched_class and on_rq of busiest->curr with the
> cpumask_test_cpu() check:
> 1. Let the PATCH not introduce new cases that skip logic for resetting
> balance_interval to min_interval.
> 2. The check of whether busiest cpu has been just triggered active balance
> filters a bit more cases than the check of sched_class and on_rq.
>
> [1]: https://lore.kernel.org/lkml/20190815145107.5318-5-valentin.schneider@arm.com/
>
> Signed-off-by: Xin Zhao <jackzxcui1989@163.com>

Reviewed-by: Valentin Schneider <vschneid@redhat.com>



* Re: [PATCH v5 1/2] sched/fair: Don't trigger active lb if src_rq->curr is not on_rq
  2026-06-17  7:21 ` [PATCH v5 1/2] sched/fair: Don't trigger active lb if src_rq->curr is not on_rq Xin Zhao
  2026-06-17  9:30   ` Valentin Schneider
@ 2026-06-18  9:18   ` Peter Zijlstra
  2026-06-18 10:09     ` Xin Zhao
  1 sibling, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2026-06-18  9:18 UTC (permalink / raw)
  To: Xin Zhao
  Cc: vschneid, mingo, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, mgorman, kprateek.nayak, pauld, aiqun.yu,
	linux-kernel

On Wed, Jun 17, 2026 at 03:21:50PM +0800, Xin Zhao wrote:
> Active balancing needs the help by migration threads which will interrupt
> task on src_rq. It has a certain impact on overall performance. Active
> balancing often fails, there is a check to determine whether the current
> task(say it 'curr') on src_rq can run on dst_rq. We have observed that
> even that, if curr is a CFS task and on_rq is 0, the failure rate of
> active balancing is very high. Below are the test data from a certain
> fillback task scenario executed on a platform with 18 CPUs over 300
> seconds:

<snip table for brevity>

> In __schedule(), before setting curr to next, during the execution of
> pick_next_task(), sched_balance_rq() is called. It will unlock and then
> re-lock the rq, creating "holes" during which other CPUs may see zero
> rq->curr->on_rq. try_to_block_task() sets curr->on_rq to 0, and during the
> rq lock "hole" in pick_next_task(), rq->curr has not yet been assigned to
> next, resulting in curr->on_rq being seen as 0.
> 
> We do not need to perform active balancing when src_rq->curr is CFS task
> but on_rq is 0, as other CFS tasks have been probably checked just before.
> For cases where src_rq->curr is a non-CFS task, we retain the affinity
> check for dst_rq to trigger active balancing because such task is likely
> to wake-up or woken-by src_rq CFS task which has similar affinity
> characteristics to migrate. Also, after executing detach_tasks(), rq lock
> is released. Tasks on the rq awakened during detach_tasks() may preempt
> the previous CFS task. Based on my test(though not shown above), success
> rate of active balancing under the condition of !fair && on_rq is 98.4%.
> This scenario does not require the use of stop work, but need to add
> another path to detach attach task(s). It seems not necessary enough to
> add it, Valentin and Vincent have already discussed about it, see [1].
> 
> Additionally, sched_class field is a bit far from on_cpu in task_struct.
> The previous traversal of cfs_tasks checks on_cpu in can_migrate_task(),
> so the additional check for on_rq will not incur much cpu cycle loss, due
> to cache locality.
> 
> Two reasons why not check sched_class and on_rq of busiest->curr with the
> cpumask_test_cpu() check:
> 1. Let the PATCH not introduce new cases that skip logic for resetting
> balance_interval to min_interval.
> 2. The check of whether busiest cpu has been just triggered active balance
> filters a bit more cases than the check of sched_class and on_rq.
> 
> [1]: https://lore.kernel.org/lkml/20190815145107.5318-5-valentin.schneider@arm.com/
> 
> Signed-off-by: Xin Zhao <jackzxcui1989@163.com>

Perhaps because it is early and I've not had enough wake-up juice, or
perhaps because it is too damn warm already and my brain is
pre-emptively shutting down already, I found it very hard to read your
Changelog.

I asked one of these fancy AI things to rephrase things, and then did a
manual edit on it and ended up with the below. Notably, the changelog
talks about a sched_class check, while the actual patch has none of
that.

Does this work for you?

---
Subject: sched/fair: Don't trigger active lb if src_rq->curr is not on_rq
From: Xin Zhao <jackzxcui1989@163.com>
Date: Wed, 17 Jun 2026 15:21:50 +0800

From: Xin Zhao <jackzxcui1989@163.com>

Active load balancing relies on migration threads, which temporarily preempt
tasks on the source runqueue (src_rq). This preemption can negatively impact
overall system performance. The active balancing logic includes a check to
verify whether the current task (curr) on src_rq can actually run on the
destination runqueue (dst_rq). We have observed that when curr is a CFS task
and its on_rq flag is 0, the active balancing failure rate is exceptionally
high. The following table summarizes test data collected over 300 seconds on an
18-CPU platform under a specific fillback task scenario:

  fair: busiest->curr->sched_class == &fair_sched_class
  on_rq: busiest->curr->on_rq
  total: active balance count triggered of correspondent type
  fail: fail to migrate one task in active_load_balance_cpu_stop()

                   fair && !on_rq   !fair && !on_rq
         domain       total    fail   total    fail
  cpu0   0x00003          0       0       0       0
  cpu0   0x3ffff         33      33       1       1
  cpu1   0x00003          0       0       0       0
  cpu1   0x3ffff         42      42       0       0
  cpu2   0x0003c          4       4       0       0
  cpu2   0x3ffff         12      12       0       0
  cpu3   0x0003c          3       3       0       0
  cpu3   0x3ffff          8       7       0       0
  cpu4   0x0003c          2       2       0       0
  cpu4   0x3ffff          5       4       0       0
  cpu5   0x0003c          4       4       0       0
  cpu5   0x3ffff          8       8       0       0
  cpu6   0x003c0         60      60       0       0
  cpu6   0x3ffff         28      27       0       0
  cpu7   0x003c0        194     184       0       0
  cpu7   0x3ffff         35      35       1       1
  cpu8   0x003c0        240     228       0       0
  cpu8   0x3ffff         28      28       0       0
  cpu9   0x003c0          0       0       0       0
  cpu9   0x3ffff         10      10       0       0
  cpu10  0x03c00         52      50       0       0
  cpu10  0x3ffff          0       0       0       0
  cpu11  0x03c00         70      68       0       0
  cpu11  0x3ffff          1       1       0       0
  cpu12  0x03c00         73      72       0       0
  cpu12  0x3ffff          0       0       0       0
  cpu13  0x03c00         79      76       0       0
  cpu13  0x3ffff          0       0       0       0
  cpu14  0x3c000          0       0       0       0
  cpu14  0x3ffff         57      55       1       0
  cpu15  0x3c000         53      52       1       0
  cpu15  0x3ffff         30      29       0       0
  cpu16  0x3c000        344     341      10       6
  cpu16  0x3ffff        103     100       2       1
  cpu17  0x3c000        183     179       2       2
  cpu17  0x3ffff         78      77       0       0
  sum                  1839    1791      18      11

In __schedule(), before curr is updated to next, pick_next_task() invokes
sched_balance_rq(). This function temporarily unlocks and relocks the runqueue,
creating a window where other CPUs may observe rq->curr->on_rq as 0.

We can safely skip active balancing when src_rq->curr->on_rq == 0, as other
eligible tasks have likely already been evaluated. 

We retain the affinity check on dst_rq to trigger active balancing, since such
tasks are often woken by (or wake up) tasks on src_rq that share similar
affinity constraints. Furthermore, detach_tasks() releases the runqueue lock;
any tasks awakened during this window may preempt the previous CFS task. My
testing (data not shown) indicates that active balancing succeeds in 98.4% of
cases where !fair && on_rq.

This scenario does not require a stop-work callback, but would necessitate an
additional detach/attach path. As Valentin and Vincent have already discussed,
this addition does not appear justified at this time (see [1]).

Since can_migrate_task() already checks on_cpu during the cfs_tasks traversal,
adding an on_rq check will have negligible performance overhead due to cache
locality.

There are two reasons for not combining the on_rq check with the
cpumask_test_cpu() check:

 - Avoiding new scenarios that would skip the logic for resetting
   balance_interval to min_interval.

 - The existing check for whether the busiest CPU recently triggered active
   load balancing already filters more cases than the on_rq check.
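
The ordering argued for above can be sketched as a small user-space model.
Everything here (struct model_rq, enum lb_outcome, decide_active_balance()) is
hypothetical and only mirrors the order of the checks in the diff below; it is
not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes of the active-balance decision. */
enum lb_outcome {
	LB_OUT_ONE_PINNED,	/* affinity mismatch: leaves via out_one_pinned */
	LB_SKIP_ACTIVE,		/* no stop work queued, interval reset still reached */
	LB_KICK_ACTIVE,		/* stop work would be queued */
};

/* Hypothetical stand-in for the few busiest-rq fields the checks look at. */
struct model_rq {
	bool curr_allowed_on_dst;	/* cpumask_test_cpu(this_cpu, curr->cpus_ptr) */
	bool active_balance;		/* busiest->active_balance */
	bool curr_on_rq;		/* busiest->curr->on_rq */
};

static enum lb_outcome decide_active_balance(struct model_rq *busiest)
{
	assert(busiest);

	/* Affinity check stays separate: it is the only one that skips the reset. */
	if (!busiest->curr_allowed_on_dst)
		return LB_OUT_ONE_PINNED;

	/* An active balance is already in flight; this filters the most cases. */
	if (busiest->active_balance)
		return LB_SKIP_ACTIVE;

	/* curr is being scheduled out; do not punt it with stop work. */
	if (!busiest->curr_on_rq)
		return LB_SKIP_ACTIVE;

	busiest->active_balance = true;
	return LB_KICK_ACTIVE;
}
```

In this model only the affinity mismatch leaves through the out_one_pinned
path. A curr task with on_rq == 0 merely suppresses the stop work, so the
balance_interval reset that follows is still reached.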

[1]: https://lore.kernel.org/lkml/20190815145107.5318-5-valentin.schneider@arm.com/

Signed-off-by: Xin Zhao <jackzxcui1989@163.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Link: https://patch.msgid.link/20260617072151.1173416-2-jackzxcui1989@163.com
---
 kernel/sched/fair.c |   19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -13482,12 +13482,21 @@ static int sched_balance_rq(int this_cpu
 			 * ->active_balance_work.  Once set, it's cleared
 			 * only after active load balance is finished.
 			 */
-			if (!busiest->active_balance) {
-				busiest->active_balance = 1;
-				busiest->push_cpu = this_cpu;
-				active_balance = 1;
-			}
+			if (busiest->active_balance)
+				goto no_active_balance;
 
+			/*
+			 * @busiest dropped its rq_lock in the middle of
+			 * scheduling out its ->curr task (->on_rq := 0), no
+			 * need to forcefully punt it away with active balance.
+			 */
+			if (!busiest->curr->on_rq)
+				goto no_active_balance;
+
+			busiest->active_balance = 1;
+			busiest->push_cpu = this_cpu;
+			active_balance = 1;
+no_active_balance:
 			preempt_disable();
 			raw_spin_rq_unlock_irqrestore(busiest, flags);
 			if (active_balance) {


* Re: [PATCH v5 2/2] sched/fair: Simplify balance_interval reset logic in sched_balance_rq()
  2026-06-17  7:21 ` [PATCH v5 2/2] sched/fair: Simplify balance_interval reset logic in sched_balance_rq() Xin Zhao
@ 2026-06-18  9:40   ` Peter Zijlstra
  2026-06-18 10:17     ` Xin Zhao
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2026-06-18  9:40 UTC (permalink / raw)
  To: Xin Zhao
  Cc: vschneid, mingo, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, mgorman, kprateek.nayak, pauld, aiqun.yu,
	linux-kernel

On Wed, Jun 17, 2026 at 03:21:51PM +0800, Xin Zhao wrote:
> In sched_balance_rq(), it is possible to call need_active_balance() twice
> in quick succession, which is not appropriate. There are two conditions in
> sched_balance_rq() that reset balance_interval to min_interval, one is
> when the local variable active_balance is 0, and the other is when
> need_active_balance() returns a non-zero value. The local variable
> active_balance is initialized to 0. Therefore, the only situation in which
> balance_interval NOT be reset to min_interval is if need_active_balance()
> has been executed once, marking the local variable active_balance as 1,
> and then the second call to need_active_balance() returns 0. In other
> words, the case is that during the interval between two close calls to
> need_active_balance(), busiest rq completes the recently dispatched active
> balance stop work, which is quite rare.
> 
> There are mainly two scenarios that lead to reaching sched_balance_rq():
> one is the newly idle balance triggered by __schedule(), and the other is
> the periodic balance logic controlled by sd->balance_interval or
> nohz.next_balance, which ultimately executes in the softirq context. The
> vast majority of cases executing sched_balance_rq() is the first scenario.
> During the execution of __schedule(), preemption is disabled, so the
> interval between two checks of need_active_balance() will not be long.
> Thus, only in the second scenario, balance_interval may NOT be reset to
> min_interval, but it's still not likely. The second scenario is in softirq
> context, the execution of two need_active_balance() checks can be
> preempted by other tasks, leading to a longer interval between the two
> checks. However, there is no evidence to suggest that not resetting
> min_interval in these low-probability cases caused by scheduling
> preemption offers any significant benefits. It would be better to simplify
> this complex reset logic for balance_interval to an unconditional reset.

This is very confusing, and my AI helper isn't helping much this time
around.

active_balance is initialized 0, it is only (but not always) set 1 when
need_active_balance().

Therefore, the condition: !active_balance || need_active_balance() is a
truism and can be removed.

Or am I missing something more complicated?
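
The argument can be checked exhaustively with a small user-space model. This is
a sketch under stated assumptions, not kernel code: reset_condition() and
count_skipped_resets() are hypothetical names, and the two results of
need_active_balance() are modelled as plain booleans so that the "result
changed between the two calls" case can be shown as well.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the old condition
 *   !active_balance || need_active_balance()
 * need_first / need_second stand for the two calls to need_active_balance().
 */
static bool reset_condition(bool ld_moved, bool need_first,
			    bool need_second, bool busiest_active)
{
	bool active_balance = false;

	/* active_balance is only ever set after need_active_balance() said yes. */
	if (!ld_moved && need_first && !busiest_active)
		active_balance = true;

	return !active_balance || need_second;
}

/* Count input combinations for which the interval reset would be skipped. */
static int count_skipped_resets(bool stable)
{
	int skipped = 0;

	for (int bits = 0; bits < 16; bits++) {
		bool ld_moved = bits & 1;
		bool need_first = bits & 2;
		bool need_second = bits & 4;
		bool busiest_active = bits & 8;

		/* "stable" means both calls return the same answer. */
		if (stable && need_first != need_second)
			continue;
		if (!reset_condition(ld_moved, need_first, need_second, busiest_active))
			skipped++;
	}
	return skipped;
}

static void check_model(void)
{
	assert(count_skipped_resets(true) == 0);	/* truism when the answer is stable */
	assert(count_skipped_resets(false) == 1);	/* only the answer flipping to 0 skips */
}
```

Under this model the condition can only be false when the second call returns
0 after the first returned non-zero, which is the rare window the original
changelog describes.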

> Signed-off-by: Xin Zhao <jackzxcui1989@163.com>
> ---
>  kernel/sched/fair.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 2b9653623..9c78241e9 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -13464,10 +13464,8 @@ static int sched_balance_rq(int this_cpu, struct rq *this_rq,
>  		sd->nr_balance_failed = 0;
>  	}
>  
> -	if (likely(!active_balance) || need_active_balance(&env)) {
> -		/* We were unbalanced, so reset the balancing interval */
> -		sd->balance_interval = sd->min_interval;
> -	}
> +	/* We were unbalanced, so reset the balancing interval */
> +	sd->balance_interval = sd->min_interval;
>  
>  	goto out;
>  
> -- 
> 2.34.1
> 


* Re: [PATCH v5 1/2] sched/fair: Don't trigger active lb if src_rq->curr is not on_rq
  2026-06-18  9:18   ` Peter Zijlstra
@ 2026-06-18 10:09     ` Xin Zhao
  0 siblings, 0 replies; 12+ messages in thread
From: Xin Zhao @ 2026-06-18 10:09 UTC (permalink / raw)
  To: peterz
  Cc: aiqun.yu, bsegall, dietmar.eggemann, jackzxcui1989, juri.lelli,
	kprateek.nayak, linux-kernel, mgorman, mingo, pauld, rostedt,
	vincent.guittot, vschneid

On Thu, 18 Jun 2026 11:18:04 +0200 Peter Zijlstra <peterz@infradead.org> wrote:

> Perhaps because it is early and I've not had enough wake-up juice, or
> perhaps because it is too damn warm already and my brain is
> pre-emptively shutting down already, I found it very hard to read your
> Changelog.
> 
> I asked one of these fancy AI things to rephrase things, and then did a
> manual edit on it and ended up with the below. Notably, the changelog
> talks about a sched_class check, while the actual patch has none of
> that.
> 
> Does this work for you?

Thank you very much for correcting my poor English. Perhaps the large
language model I used simply couldn't handle my expressions in Chinese.
Once again, I appreciate your corrections on several inappropriate word
choices.

Presenting experimental data for both fair and !fair indicates that the
!fair scenario can also benefit from !on_rq filtering. I believe your
modified commit log is good enough.

Thanks
Xin Zhao



* Re: [PATCH v5 2/2] sched/fair: Simplify balance_interval reset logic in sched_balance_rq()
  2026-06-18  9:40   ` Peter Zijlstra
@ 2026-06-18 10:17     ` Xin Zhao
  2026-06-18 10:31       ` Peter Zijlstra
  0 siblings, 1 reply; 12+ messages in thread
From: Xin Zhao @ 2026-06-18 10:17 UTC (permalink / raw)
  To: peterz
  Cc: aiqun.yu, bsegall, dietmar.eggemann, jackzxcui1989, juri.lelli,
	kprateek.nayak, linux-kernel, mgorman, mingo, pauld, rostedt,
	vincent.guittot, vschneid

On Thu, 18 Jun 2026 11:40:56 +0200 Peter Zijlstra <peterz@infradead.org> wrote:

> This is very confusing, and my AI helper isn't helping much this time
> around.
> 
> active_balance is initialized 0, it is only (but not always) set 1 when
> need_active_balance().
> 
> Therefore, the condition: !active_balance || need_active_balance() is a
> truism and can be removed.
> 
> Or am I missing something more complicated?

Sorry for my poor English again.

I will change the commit log as below:

active_balance is initialized 0, it is only (but not always) set 1 when
need_active_balance().

Therefore, the condition: !active_balance || need_active_balance() is a
truism in most cases and can be removed.

Thanks
Xin Zhao



* Re: [PATCH v5 2/2] sched/fair: Simplify balance_interval reset logic in sched_balance_rq()
  2026-06-18 10:17     ` Xin Zhao
@ 2026-06-18 10:31       ` Peter Zijlstra
  2026-06-18 10:49         ` Xin Zhao
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2026-06-18 10:31 UTC (permalink / raw)
  To: Xin Zhao
  Cc: aiqun.yu, bsegall, dietmar.eggemann, juri.lelli, kprateek.nayak,
	linux-kernel, mgorman, mingo, pauld, rostedt, vincent.guittot,
	vschneid

On Thu, Jun 18, 2026 at 06:17:40PM +0800, Xin Zhao wrote:
> On Thu, 18 Jun 2026 11:40:56 +0200 Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > This is very confusing, and my AI helper isn't helping much this time
> > around.
> > 
> > active_balance is initialized 0, it is only (but not always) set 1 when
> > need_active_balance().
> > 
> > Therefore, the condition: !active_balance || need_active_balance() is a
> > truism and can be removed.
> > 
> > Or am I missing something more complicated?
> 
> Sorry for my poor English again.

No need to be; English isn't my native tongue either, although it is
much closer linguistically. It just takes a little patience (and LLM
help these days), but we'll get there.

> I will change the commit log as below:

No need, I shall edit, the patch itself looked fine.


* Re: [PATCH v5 2/2] sched/fair: Simplify balance_interval reset logic in sched_balance_rq()
  2026-06-18 10:31       ` Peter Zijlstra
@ 2026-06-18 10:49         ` Xin Zhao
  0 siblings, 0 replies; 12+ messages in thread
From: Xin Zhao @ 2026-06-18 10:49 UTC (permalink / raw)
  To: peterz
  Cc: aiqun.yu, bsegall, dietmar.eggemann, jackzxcui1989, juri.lelli,
	kprateek.nayak, linux-kernel, mgorman, mingo, pauld, rostedt,
	vincent.guittot, vschneid

On Thu, 18 Jun 2026 12:31:51 +0200 Peter Zijlstra <peterz@infradead.org> wrote:

> No need to be; English isn't my native tongue either, although it is
> much closer linguistically. It just takes a little patience (and LLM
> help these days), but we'll get there.
> 
> > I will change the commit log as below:
> 
> No need, I shall edit, the patch itself looked fine.

Great Thx.

Thanks
Xin Zhao



* Re: [PATCH v5 0/2] sched/fair: Optimize some active balance logic
  2026-06-17  7:21 [PATCH v5 0/2] sched/fair: Optimize some active balance logic Xin Zhao
  2026-06-17  7:21 ` [PATCH v5 1/2] sched/fair: Don't trigger active lb if src_rq->curr is not on_rq Xin Zhao
  2026-06-17  7:21 ` [PATCH v5 2/2] sched/fair: Simplify balance_interval reset logic in sched_balance_rq() Xin Zhao
@ 2026-06-18 10:56 ` Peter Zijlstra
  2026-06-18 13:56   ` Xin Zhao
  2 siblings, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2026-06-18 10:56 UTC (permalink / raw)
  To: Xin Zhao
  Cc: vschneid, mingo, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, mgorman, kprateek.nayak, pauld, aiqun.yu,
	linux-kernel



And since I've been staring at this code far too long, I accidentally
did the below cleanup on top.


---
Subject: sched/fair: Reflow sched_balance_rq()
From: Peter Zijlstra <peterz@infradead.org>
Date: Thu Jun 18 10:51:49 CEST 2026

Reflow to reduce indenting.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/sched/fair.c  |  136 ++++++++++++++++++++++++---------------------------
 kernel/sched/sched.h |   19 ++++++-
 2 files changed, 82 insertions(+), 73 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -13437,82 +13437,78 @@ static int sched_balance_rq(int this_cpu
 		}
 	}
 
-	if (!ld_moved) {
-		schedstat_inc(sd->lb_failed[idle]);
+	if (ld_moved) {
+		sd->nr_balance_failed = 0;
+		goto out_unbalanced;
+	}
+
+	schedstat_inc(sd->lb_failed[idle]);
+	/*
+	 * Increment the failure counter only on periodic balance.
+	 * We do not want newidle balance, which can be very
+	 * frequent, pollute the failure counter causing
+	 * excessive cache_hot migrations and active balances.
+	 *
+	 * Similarly for migration_misfit which is not related to
+	 * load/util migration, don't pollute nr_balance_failed.
+	 *
+	 * The same for cache aware scheduling's allowance for
+	 * load imbalance. If regular load balance does not
+	 * migrate task due to LLC locality, it is a expected
+	 * behavior and don't pollute nr_balance_failed.
+	 * See can_migrate_task().
+	 */
+	if (idle != CPU_NEWLY_IDLE &&
+	    env.migration_type != migrate_misfit &&
+	    !(env.flags & LBF_LLC_PINNED))
+		sd->nr_balance_failed++;
+
+	if (!need_active_balance(&env))
+		goto out_unbalanced;
+
+	scoped_guard (raw_spin_rq_lock_irqsave, busiest) {
 		/*
-		 * Increment the failure counter only on periodic balance.
-		 * We do not want newidle balance, which can be very
-		 * frequent, pollute the failure counter causing
-		 * excessive cache_hot migrations and active balances.
-		 *
-		 * Similarly for migration_misfit which is not related to
-		 * load/util migration, don't pollute nr_balance_failed.
-		 *
-		 * The same for cache aware scheduling's allowance for
-		 * load imbalance. If regular load balance does not
-		 * migrate task due to LLC locality, it is a expected
-		 * behavior and don't pollute nr_balance_failed.
-		 * See can_migrate_task().
+		 * Don't kick the active_load_balance_cpu_stop,
+		 * if the curr task on busiest CPU can't be
+		 * moved to this_cpu:
 		 */
-		if (idle != CPU_NEWLY_IDLE &&
-		    env.migration_type != migrate_misfit &&
-		    !(env.flags & LBF_LLC_PINNED))
-			sd->nr_balance_failed++;
-
-		if (need_active_balance(&env)) {
-			unsigned long flags;
-
-			raw_spin_rq_lock_irqsave(busiest, flags);
-
-			/*
-			 * Don't kick the active_load_balance_cpu_stop,
-			 * if the curr task on busiest CPU can't be
-			 * moved to this_cpu:
-			 */
-			if (!cpumask_test_cpu(this_cpu, busiest->curr->cpus_ptr)) {
-				raw_spin_rq_unlock_irqrestore(busiest, flags);
-				goto out_one_pinned;
-			}
-
-			/* Record that we found at least one task that could run on this_cpu */
-			env.flags &= ~LBF_ALL_PINNED;
-
-			/*
-			 * ->active_balance synchronizes accesses to
-			 * ->active_balance_work.  Once set, it's cleared
-			 * only after active load balance is finished.
-			 */
-			if (busiest->active_balance)
-				goto no_active_balance;
-
-			/*
-			 * @busiest dropped its rq_lock in the middle of
-			 * scheduling out its ->curr task (->on_rq := 0), no
-			 * need to forcefully punt it away with active balance.
-			 */
-			if (!busiest->curr->on_rq)
-				goto no_active_balance;
-
-			busiest->active_balance = 1;
-			busiest->push_cpu = this_cpu;
-			active_balance = 1;
-no_active_balance:
-			preempt_disable();
-			raw_spin_rq_unlock_irqrestore(busiest, flags);
-			if (active_balance) {
-				stop_one_cpu_nowait(cpu_of(busiest),
-					active_load_balance_cpu_stop, busiest,
-					&busiest->active_balance_work);
-			}
-			preempt_enable();
-		}
-	} else {
-		sd->nr_balance_failed = 0;
+		if (!cpumask_test_cpu(this_cpu, busiest->curr->cpus_ptr))
+			goto out_one_pinned;
+
+		/* Record that we found at least one task that could run on this_cpu */
+		env.flags &= ~LBF_ALL_PINNED;
+
+		/*
+		 * ->active_balance synchronizes accesses to
+		 * ->active_balance_work.  Once set, it's cleared
+		 * only after active load balance is finished.
+		 */
+		if (busiest->active_balance)
+			goto out_unbalanced;
+
+		/*
+		 * @busiest dropped its rq_lock in the middle of
+		 * scheduling out its ->curr task (->on_rq := 0), no
+		 * need to forcefully punt it away with active balance.
+		 */
+		if (!busiest->curr->on_rq)
+			goto out_unbalanced;
+
+		busiest->active_balance = 1;
+		busiest->push_cpu = this_cpu;
+		active_balance = 1;
+		preempt_disable();
 	}
+	if (active_balance) {
+		stop_one_cpu_nowait(cpu_of(busiest),
+				    active_load_balance_cpu_stop, busiest,
+				    &busiest->active_balance_work);
+	}
+	preempt_enable();
 
+out_unbalanced:
 	/* We were unbalanced, so reset the balancing interval */
 	sd->balance_interval = sd->min_interval;
-
 	goto out;
 
 out_balanced:
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2018,7 +2018,8 @@ DEFINE_LOCK_GUARD_1(rq_lock, struct rq,
 		    rq_unlock(_T->lock, &_T->rf),
 		    struct rq_flags rf)
 
-DECLARE_LOCK_GUARD_1_ATTRS(rq_lock, __acquires(__rq_lockp(_T)), __releases(__rq_lockp(*(struct rq **)_T)));
+DECLARE_LOCK_GUARD_1_ATTRS(rq_lock, __acquires(__rq_lockp(_T)),
+			   __releases(__rq_lockp(*(struct rq **)_T)));
 #define class_rq_lock_constructor(_T) WITH_LOCK_GUARD_1_ATTRS(rq_lock, _T)
 
 DEFINE_LOCK_GUARD_1(rq_lock_irq, struct rq,
@@ -2026,7 +2027,8 @@ DEFINE_LOCK_GUARD_1(rq_lock_irq, struct
 		    rq_unlock_irq(_T->lock, &_T->rf),
 		    struct rq_flags rf)
 
-DECLARE_LOCK_GUARD_1_ATTRS(rq_lock_irq, __acquires(__rq_lockp(_T)), __releases(__rq_lockp(*(struct rq **)_T)));
+DECLARE_LOCK_GUARD_1_ATTRS(rq_lock_irq, __acquires(__rq_lockp(_T)),
+			   __releases(__rq_lockp(*(struct rq **)_T)));
 #define class_rq_lock_irq_constructor(_T) WITH_LOCK_GUARD_1_ATTRS(rq_lock_irq, _T)
 
 DEFINE_LOCK_GUARD_1(rq_lock_irqsave, struct rq,
@@ -2034,9 +2036,20 @@ DEFINE_LOCK_GUARD_1(rq_lock_irqsave, str
 		    rq_unlock_irqrestore(_T->lock, &_T->rf),
 		    struct rq_flags rf)
 
-DECLARE_LOCK_GUARD_1_ATTRS(rq_lock_irqsave, __acquires(__rq_lockp(_T)), __releases(__rq_lockp(*(struct rq **)_T)));
+DECLARE_LOCK_GUARD_1_ATTRS(rq_lock_irqsave, __acquires(__rq_lockp(_T)),
+			   __releases(__rq_lockp(*(struct rq **)_T)));
 #define class_rq_lock_irqsave_constructor(_T) WITH_LOCK_GUARD_1_ATTRS(rq_lock_irqsave, _T)
 
+DEFINE_LOCK_GUARD_1(raw_spin_rq_lock_irqsave, struct rq,
+		    raw_spin_rq_lock_irqsave(_T->lock, _T->flags),
+		    raw_spin_rq_unlock_irqrestore(_T->lock, _T->flags),
+		    unsigned long flags)
+
+DECLARE_LOCK_GUARD_1_ATTRS(raw_spin_rq_lock_irqsave, __acquires(__rq_lockp(_T)),
+			   __releases(__rq_lockp(*(struct rq **)_T)));
+#define class_raw_spin_rq_lock_irqsave_constructor(_T) \
+	WITH_LOCK_GUARD_1_ATTRS(raw_spin_rq_lock_irqsave, _T)
+
 #define this_rq_lock_irq(...) __acquire_ret(_this_rq_lock_irq(__VA_ARGS__), __rq_lockp(__ret))
 static inline struct rq *_this_rq_lock_irq(struct rq_flags *rf) __acquires_ret
 {
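
The reflow above relies on scoped_guard() releasing the runqueue lock on every
exit from the guarded block, including the early goto paths. The kernel guard
helpers are built on the compiler's cleanup attribute; the user-space sketch
below shows only that underlying idea. struct model_rq, model_rq_lock(),
model_rq_unlock() and try_mark_active() are hypothetical names, and the "lock"
is a plain flag rather than a real spinlock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a runqueue with a flag instead of a lock. */
struct model_rq {
	bool locked;
	bool curr_allowed;
	bool active_balance;
};

static struct model_rq *model_rq_lock(struct model_rq *rq)
{
	assert(!rq->locked);
	rq->locked = true;
	return rq;
}

/* Cleanup handlers receive a pointer to the guarded variable. */
static void model_rq_unlock(struct model_rq **rqp)
{
	(*rqp)->locked = false;
}

static bool try_mark_active(struct model_rq *rq)
{
	{
		/* GCC/Clang extension: run model_rq_unlock() when guard leaves scope. */
		struct model_rq *guard
			__attribute__((cleanup(model_rq_unlock))) = model_rq_lock(rq);
		(void)guard;

		if (!rq->curr_allowed)
			return false;	/* early exit, lock is still released */

		rq->active_balance = true;
	}				/* normal exit, lock released here */

	return true;
}
```

Because the unlock is tied to scope exit, the early-exit branch needs no
explicit unlock call, which is what allows the reflowed code to drop the
unlock-before-goto pairs.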


* Re: [PATCH v5 0/2] sched/fair: Optimize some active balance logic
  2026-06-18 10:56 ` [PATCH v5 0/2] sched/fair: Optimize some active balance logic Peter Zijlstra
@ 2026-06-18 13:56   ` Xin Zhao
  0 siblings, 0 replies; 12+ messages in thread
From: Xin Zhao @ 2026-06-18 13:56 UTC (permalink / raw)
  To: peterz
  Cc: aiqun.yu, bsegall, dietmar.eggemann, jackzxcui1989, juri.lelli,
	kprateek.nayak, linux-kernel, mgorman, mingo, pauld, rostedt,
	vincent.guittot, vschneid

On Thu, 18 Jun 2026 12:56:27 +0200 Peter Zijlstra <peterz@infradead.org> wrote:

> And since I've been staring at this code far too long, I accidentally
> did the below cleanup on top.
> 
> 
> ---
> Subject: sched/fair: Reflow sched_balance_rq()
> From: Peter Zijlstra <peterz@infradead.org>
> Date: Thu Jun 18 10:51:49 CEST 2026
> 
> Reflow to reduce indenting.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  kernel/sched/fair.c  |  136 ++++++++++++++++++++++++---------------------------
>  kernel/sched/sched.h |   19 ++++++-
>  2 files changed, 82 insertions(+), 73 deletions(-)
> 
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -13437,82 +13437,78 @@ static int sched_balance_rq(int this_cpu
>  		}
>  	}

It looks great after the modifications.


Thanks
Xin Zhao



Thread overview: 12+ messages
2026-06-17  7:21 [PATCH v5 0/2] sched/fair: Optimize some active balance logic Xin Zhao
2026-06-17  7:21 ` [PATCH v5 1/2] sched/fair: Don't trigger active lb if src_rq->curr is not on_rq Xin Zhao
2026-06-17  9:30   ` Valentin Schneider
2026-06-18  9:18   ` Peter Zijlstra
2026-06-18 10:09     ` Xin Zhao
2026-06-17  7:21 ` [PATCH v5 2/2] sched/fair: Simplify balance_interval reset logic in sched_balance_rq() Xin Zhao
2026-06-18  9:40   ` Peter Zijlstra
2026-06-18 10:17     ` Xin Zhao
2026-06-18 10:31       ` Peter Zijlstra
2026-06-18 10:49         ` Xin Zhao
2026-06-18 10:56 ` [PATCH v5 0/2] sched/fair: Optimize some active balance logic Peter Zijlstra
2026-06-18 13:56   ` Xin Zhao
