* [PATCH v2] sched/core: Fix RQCF_ACT_SKIP leak
@ 2023-10-12 9:00 Hao Jia
2023-10-20 10:46 ` Peter Zijlstra
From: Hao Jia @ 2023-10-12 9:00 UTC
To: mingo, peterz, mingo, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid
Cc: linux-kernel, Hao Jia, stable, Igor Raits, Bagas Sanjaya
Igor Raits and Bagas Sanjaya reported an RQCF_ACT_SKIP leak warning.
Link: https://lore.kernel.org/all/a5dd536d-041a-2ce9-f4b7-64d8d85c86dc@gmail.com

This warning can be triggered in the following situation:

        CPU0                                    CPU1

__schedule()
  *rq->clock_update_flags <<= 1;*       unregister_fair_sched_group()
  pick_next_task_fair+0x4a/0x410          destroy_cfs_bandwidth()
    newidle_balance+0x115/0x3e0             for_each_possible_cpu(i) *i=0*
      rq_unpin_lock(this_rq, rf)              __cfsb_csd_unthrottle()
      raw_spin_rq_unlock(this_rq)
                                                rq_lock(*CPU0_rq*, &rf)
                                                rq_clock_start_loop_update()
                                                rq->clock_update_flags & RQCF_ACT_SKIP <--
      raw_spin_rq_lock(this_rq)

The purpose of RQCF_ACT_SKIP is to skip the rq clock update. That
update happens very early in __schedule(), but RQCF_*_SKIP is only
cleared very late, so the flag stays set across the window above,
where the rq lock is dropped, and this warning is triggered.
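
For illustration only (not part of the patch), the window described above
can be modeled in user space. This is a minimal sketch: struct rq_model and
the model_*() helpers are hypothetical names, the flag values are assumed
to match kernel/sched/sched.h, and locking is not modeled at all.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values assumed to match kernel/sched/sched.h. */
#define RQCF_REQ_SKIP 0x01
#define RQCF_ACT_SKIP 0x02
#define RQCF_UPDATED  0x04

/* Hypothetical stand-in for struct rq: only the flags word is modeled. */
struct rq_model {
	unsigned int clock_update_flags;
};

/* Models update_rq_clock(): an active skip suppresses the update. */
static void model_update_rq_clock(struct rq_model *rq)
{
	if (rq->clock_update_flags & RQCF_ACT_SKIP)
		return;
	rq->clock_update_flags |= RQCF_UPDATED;
}

/* Models the check CPU1 performs in rq_clock_start_loop_update(). */
static bool model_act_skip_leaked(const struct rq_model *rq)
{
	return (rq->clock_update_flags & RQCF_ACT_SKIP) != 0;
}

/*
 * Pre-patch ordering on CPU0, up to the point where newidle_balance()
 * drops the rq lock. Returns what CPU1 would observe in that window.
 */
static bool model_old_schedule_window(bool skip_requested)
{
	struct rq_model rq = {
		.clock_update_flags = skip_requested ? RQCF_REQ_SKIP : 0,
	};

	rq.clock_update_flags <<= 1;	/* Promote REQ to ACT */
	model_update_rq_clock(&rq);
	/* Pre-patch, ACT_SKIP is still set here; it was cleared much later. */
	return model_act_skip_leaked(&rq);
}
```

In this model, model_old_schedule_window(true) reports a leak and
model_old_schedule_window(false) does not, which matches the expectation
that the warning only fires when a skip had been requested beforehand.
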

In __schedule(), clear the RQCF_*_SKIP flags immediately after
update_rq_clock() to avoid this RQCF_ACT_SKIP leak warning, and set
rq->clock_update_flags to RQCF_UPDATED to avoid the
rq->clock_update_flags < RQCF_ACT_SKIP warning that could otherwise
be triggered later.
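
As a rough illustration of why the flags are set to RQCF_UPDATED rather
than merely having the skip bits cleared, the two checks can be written as
pure predicates. This is a user-space sketch under assumptions: the helper
names are hypothetical, the flag values are assumed to match
kernel/sched/sched.h, and flags_after_clear_only() is an imagined
alternative, not code from any posted version of this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values assumed to match kernel/sched/sched.h. */
#define RQCF_REQ_SKIP 0x01
#define RQCF_ACT_SKIP 0x02
#define RQCF_UPDATED  0x04

/* The leak check: fires if ACT_SKIP is visible to another lock holder. */
static bool warns_act_skip_leak(unsigned int flags)
{
	return (flags & RQCF_ACT_SKIP) != 0;
}

/* The "clock not updated" check: rq->clock_update_flags < RQCF_ACT_SKIP. */
static bool warns_clock_not_updated(unsigned int flags)
{
	return flags < RQCF_ACT_SKIP;
}

/* Models update_rq_clock(): an active skip suppresses the update. */
static unsigned int model_update(unsigned int flags)
{
	if (flags & RQCF_ACT_SKIP)
		return flags;
	return flags | RQCF_UPDATED;
}

/* Ordering with this patch applied. */
static unsigned int flags_after_patched_update(unsigned int flags)
{
	flags <<= 1;			/* Promote REQ to ACT */
	flags = model_update(flags);	/* may be skipped */
	flags = RQCF_UPDATED;		/* the line added by the patch */
	return flags;
}

/* Hypothetical alternative: only clear the skip bits after the update. */
static unsigned int flags_after_clear_only(unsigned int flags)
{
	flags <<= 1;
	flags = model_update(flags);
	flags &= ~(RQCF_ACT_SKIP | RQCF_REQ_SKIP);
	return flags;
}
```

Starting from RQCF_REQ_SKIP, the patched ordering ends with RQCF_UPDATED,
so neither predicate fires. The clear-only alternative ends with 0 in this
model, because the skipped update never set RQCF_UPDATED, and 0 is below
RQCF_ACT_SKIP.
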
Fixes: ebb83d84e49b ("sched/core: Avoid multiple calling update_rq_clock() in __cfsb_csd_unthrottle()")
Cc: stable@vger.kernel.org
Reported-by: Igor Raits <igor.raits@gmail.com>
Reported-by: Bagas Sanjaya <bagasdotme@gmail.com>
Closes: https://lore.kernel.org/all/20230913082424.73252-1-jiahao.os@bytedance.com
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Hao Jia <jiahao.os@bytedance.com>
---
kernel/sched/core.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 802551e0009b..afb8d213155b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5374,8 +5374,6 @@ context_switch(struct rq *rq, struct task_struct *prev,
 	/* switch_mm_cid() requires the memory barriers above. */
 	switch_mm_cid(rq, prev, next);
 
-	rq->clock_update_flags &= ~(RQCF_ACT_SKIP|RQCF_REQ_SKIP);
-
 	prepare_lock_switch(rq, next, rf);
 
 	/* Here we just switch the register state and the stack. */
@@ -6615,6 +6613,7 @@ static void __sched notrace __schedule(unsigned int sched_mode)
 	/* Promote REQ to ACT */
 	rq->clock_update_flags <<= 1;
 	update_rq_clock(rq);
+	rq->clock_update_flags = RQCF_UPDATED;
 
 	switch_count = &prev->nivcsw;
 
@@ -6694,8 +6693,6 @@ static void __sched notrace __schedule(unsigned int sched_mode)
 		/* Also unlocks the rq: */
 		rq = context_switch(rq, prev, next, &rf);
 	} else {
-		rq->clock_update_flags &= ~(RQCF_ACT_SKIP|RQCF_REQ_SKIP);
-
 		rq_unpin_lock(rq, &rf);
 		__balance_callbacks(rq);
 		raw_spin_rq_unlock_irq(rq);
--
2.39.2
* Re: [PATCH v2] sched/core: Fix RQCF_ACT_SKIP leak
2023-10-12 9:00 [PATCH v2] sched/core: Fix RQCF_ACT_SKIP leak Hao Jia
@ 2023-10-20 10:46 ` Peter Zijlstra
From: Peter Zijlstra @ 2023-10-20 10:46 UTC
To: Hao Jia
Cc: mingo, mingo, juri.lelli, vincent.guittot, dietmar.eggemann,
rostedt, bsegall, mgorman, bristot, vschneid, linux-kernel,
stable, Igor Raits, Bagas Sanjaya
On Thu, Oct 12, 2023 at 05:00:03PM +0800, Hao Jia wrote:
> Igor Raits and Bagas Sanjaya reported an RQCF_ACT_SKIP leak warning.
> Link: https://lore.kernel.org/all/a5dd536d-041a-2ce9-f4b7-64d8d85c86dc@gmail.com
>
> This warning can be triggered in the following situation:
>
>         CPU0                                    CPU1
>
> __schedule()
>   *rq->clock_update_flags <<= 1;*       unregister_fair_sched_group()
>   pick_next_task_fair+0x4a/0x410          destroy_cfs_bandwidth()
>     newidle_balance+0x115/0x3e0             for_each_possible_cpu(i) *i=0*
>       rq_unpin_lock(this_rq, rf)              __cfsb_csd_unthrottle()
>       raw_spin_rq_unlock(this_rq)
>                                                 rq_lock(*CPU0_rq*, &rf)
>                                                 rq_clock_start_loop_update()
>                                                 rq->clock_update_flags & RQCF_ACT_SKIP <--
>       raw_spin_rq_lock(this_rq)
>
> The purpose of RQCF_ACT_SKIP is to skip the rq clock update. That
> update happens very early in __schedule(), but RQCF_*_SKIP is only
> cleared very late, so the flag stays set across the window above,
> where the rq lock is dropped, and this warning is triggered.
>
> In __schedule(), clear the RQCF_*_SKIP flags immediately after
> update_rq_clock() to avoid this RQCF_ACT_SKIP leak warning, and set
> rq->clock_update_flags to RQCF_UPDATED to avoid the
> rq->clock_update_flags < RQCF_ACT_SKIP warning that could otherwise
> be triggered later.
>
Thanks!