From: 'Tejun Heo' <tj@kernel.org>
To: liuwenfang <liuwenfang@honor.com>
Cc: 'David Vernet' <void@manifault.com>,
'Andrea Righi' <arighi@nvidia.com>,
'Changwoo Min' <changwoo@igalia.com>,
'Ingo Molnar' <mingo@redhat.com>,
'Peter Zijlstra' <peterz@infradead.org>,
'Juri Lelli' <juri.lelli@redhat.com>,
'Vincent Guittot' <vincent.guittot@linaro.org>,
'Dietmar Eggemann' <dietmar.eggemann@arm.com>,
'Steven Rostedt' <rostedt@goodmis.org>,
'Ben Segall' <bsegall@google.com>, 'Mel Gorman' <mgorman@suse.de>,
'Valentin Schneider' <vschneid@redhat.com>,
"'linux-kernel@vger.kernel.org'" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] sched_ext: Fix cpu_released while RT task and SCX task are scheduled concurrently
Date: Mon, 23 Jun 2025 09:50:55 -1000
Message-ID: <aFmwHzO2AKFXO_YS@slm.duckdns.org>
In-Reply-To: <fca528bb34394de3a7e87a873fadd9df@honor.com>
Hello,
On Sat, Jun 21, 2025 at 04:09:55AM +0000, liuwenfang wrote:
> Supposed RT task(rt1) is running on one CPU with its rq->scx.cpu_released
> set to true, if the rt1 becomes sleeping, then the scheduler will balance
> the remote SCX task(scx1) because there is no other RT task on its rq,
> and rq->scx.cpu_released is false. While one RT task(rt2) is placed on
> this rq(maybe rt2 wakeup or migration occurs) before the scx1 is enqueued,
> then the scheduler will pick rt2. At last, rt2 will be running on this cpu
> with rq->scx.cpu_released being false!
> The main reason is that consume_remote_task() will unlock rq lock.
This is rather difficult to follow. Can you please break this down to a
table? People often use a format like the following:
  CPU X                         CPU Y
  A does something
                                B does something else
  ...
                                ...
  Boom
> @@ -2470,6 +2471,11 @@ static inline void put_prev_set_next_task(struct rq *rq,
>
>  	prev->sched_class->put_prev_task(rq, prev, next);
>  	next->sched_class->set_next_task(rq, next, true);
> +
> +#ifdef CONFIG_SCHED_CLASS_EXT
> +	if (scx_enabled())
> +		switch_class(rq, next);
> +#endif
You're right that there is a race condition around this. I can't see a
way to solve it in SCX proper, as there's no way for balance() to tell
whether a higher priority sched class has queued something while balance()
dropped the rq lock for migration, so adding a hook to
put_prev_set_next_task() seems like a reasonable solution. However, can you
please do the following?
- Improve the description so that the race condition is clearly
understandable and explain why the extra hook in put_prev_set_next_task()
is necessary.
- Rename switch_class() to something which fits the new location better -
maybe scx_put_prev_set_next_task().
- If the function is called from put_prev_set_next_task(), it doesn't need
to be called from put_prev_task_scx(). Drop that call.
Thanks.
--
tejun
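
The race discussed in this message can be modeled in a few lines of
userspace C. This is only a toy sketch under stated assumptions, not kernel
code: struct toy_rq, the toy_* functions and the with_hook flag are all
hypothetical names invented for illustration, and the "lock drop" is
simulated by a boolean rather than real locking. It shows why the flag goes
stale when an RT task is queued while balance() has the rq lock dropped, and
how a hook at the put_prev_set_next_task() step could repair it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for kernel state; every name here is hypothetical. */
enum toy_class { TOY_CLASS_SCX, TOY_CLASS_RT };

struct toy_rq {
	bool cpu_released;	/* models rq->scx.cpu_released */
	int rt_nr_queued;	/* RT tasks waiting on this rq */
	int scx_nr_queued;	/* SCX tasks waiting on this rq */
};

/*
 * Models the SCX balance step from the quoted description: it sees no RT
 * work, marks the CPU as reclaimed, then drops the rq lock to pull a
 * remote SCX task. rt_wakes_in_window models an RT task being queued
 * while the lock is dropped.
 */
static void toy_balance_scx(struct toy_rq *rq, bool rt_wakes_in_window)
{
	rq->cpu_released = false;
	/* rq lock would be dropped here */
	if (rt_wakes_in_window)
		rq->rt_nr_queued++;
	/* rq lock would be re-acquired here */
	rq->scx_nr_queued++;	/* the remote SCX task lands on this rq */
}

/* The higher priority class wins, as in the real pick order. */
static enum toy_class toy_pick_next(struct toy_rq *rq)
{
	if (rq->rt_nr_queued > 0) {
		rq->rt_nr_queued--;
		return TOY_CLASS_RT;
	}
	assert(rq->scx_nr_queued > 0);
	rq->scx_nr_queued--;
	return TOY_CLASS_SCX;
}

/*
 * Models the hook placement under discussion: when with_hook is set, the
 * flag is recomputed from the class that was actually picked.
 */
static void toy_put_prev_set_next(struct toy_rq *rq, enum toy_class next,
				  bool with_hook)
{
	if (with_hook)
		rq->cpu_released = (next != TOY_CLASS_SCX);
}

/* Runs the racy sequence once and returns the final cpu_released value. */
bool toy_run(bool with_hook)
{
	struct toy_rq rq = { .cpu_released = true };	/* rt1 was running */
	enum toy_class next;

	toy_balance_scx(&rq, true);	/* rt2 arrives in the window */
	next = toy_pick_next(&rq);	/* picks rt2 */
	toy_put_prev_set_next(&rq, next, with_hook);
	return rq.cpu_released;
}
```

In this model toy_run(false) leaves cpu_released false even though an RT
task ends up running, which is the inconsistency described in the quoted
report, while toy_run(true) ends with cpu_released true.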