From: Chengming Zhou <chengming.zhou@linux.dev>
To: Tejun Heo <tj@kernel.org>, David Vernet <void@manifault.com>,
Andrea Righi <arighi@nvidia.com>,
Changwoo Min <changwoo@igalia.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 02/12] sched_ext: Avoid NULL scx_root deref through SCX_HAS_OP()
Date: Thu, 24 Apr 2025 15:23:40 +0800
Message-ID: <b9814fec-a9b6-4cd5-a0b1-1c2ddb214a03@linux.dev>
In-Reply-To: <20250423234542.1890867-3-tj@kernel.org>

On 2025/4/24 07:44, Tejun Heo wrote:
> SCX_HAS_OP() tests scx_root->has_op bitmap. The bitmap is currently in a
> statically allocated struct scx_sched and initialized while loading the BPF
> scheduler and cleared while unloading, and thus can be tested anytime.
> However, scx_root will be switched to dynamic allocation and thus won't
> always be dereferenceable.
>
> Most usages of SCX_HAS_OP() are already protected by scx_enabled() either
> directly or indirectly (e.g. through a task which is on SCX). However, there
> are a couple places that could try to dereference NULL scx_root. Update them
> so that scx_root is guaranteed to be valid before SCX_HAS_OP() is called.
>
> - In handle_hotplug(), test whether scx_root is NULL before doing anything
>   else. This is safe because scx_root updates will be protected by
>   cpus_read_lock().
>
> - In scx_tg_offline(), test scx_cgroup_enabled before invoking SCX_HAS_OP(),
>   which should guarantee that scx_root won't turn NULL. This is also in line
>   with other cgroup operations. As the code path is synchronized against
>   scx_cgroup_init/exit() through scx_cgroup_rwsem, this shouldn't cause any
>   behavior differences.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> ---
> kernel/sched/ext.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 975f6963a01b..ad392890d2dd 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -3498,6 +3498,14 @@ static void handle_hotplug(struct rq *rq, bool online)
>
>  	atomic_long_inc(&scx_hotplug_seq);
>
> +	/*
> +	 * scx_root updates are protected by cpus_read_lock() and will stay
> +	 * stable here. Note that we can't depend on scx_enabled() test as the
> +	 * hotplug ops need to be enabled before __scx_enabled is set.
> +	 */
> +	if (!scx_root)
> +		return;
> +
>  	if (scx_enabled())
>  		scx_idle_update_selcpu_topology(&scx_root->ops);

Just curious: does the comment added above mean we shouldn't
check scx_enabled() here anymore?
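
To make the question concrete, here is a minimal user-space sketch of the three states the comment implies: not loaded, loading (scx_root set but __scx_enabled not yet set), and enabled. All names in it (sched_model, root_model, enabled_model, handle_hotplug_model(), the ACT_* values) are hypothetical stand-ins rather than kernel code, and it assumes the rest of handle_hotplug() goes on to test and invoke a hotplug op through the has_op bitmap.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct scx_sched; only what the sketch needs. */
struct sched_model {
	bool has_hotplug_op;	/* models an SCX_HAS_OP() test for a hotplug op */
};

enum {
	ACT_NONE     = 0,
	ACT_TOPOLOGY = 1 << 0,	/* models scx_idle_update_selcpu_topology() */
	ACT_HOTPLUG  = 1 << 1,	/* models invoking the hotplug op */
};

static struct sched_model *root_model;	/* models scx_root; NULL when unloaded */
static bool enabled_model;		/* models scx_enabled() */

/* Returns a bitmask of the actions the modelled handler would take. */
static int handle_hotplug_model(void)
{
	int done = ACT_NONE;

	/* New guard: nothing below may dereference a NULL root. */
	if (!root_model)
		return done;

	/* Existing test: only true once loading has finished. */
	if (enabled_model)
		done |= ACT_TOPOLOGY;

	/* Assumed remainder: the op is looked up through the root. */
	if (root_model->has_hotplug_op)
		done |= ACT_HOTPLUG;

	return done;
}
```

In this model the two tests are not equivalent: in the loading state the hotplug op runs while the topology update is skipped. Whether the real code still wants that distinction is what I am asking.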
Thanks!
>
> @@ -3994,7 +4002,8 @@ void scx_tg_offline(struct task_group *tg)
>
>  	percpu_down_read(&scx_cgroup_rwsem);
>
> -	if (SCX_HAS_OP(scx_root, cgroup_exit) && (tg->scx_flags & SCX_TG_INITED))
> +	if (scx_cgroup_enabled && SCX_HAS_OP(scx_root, cgroup_exit) &&
> +	    (tg->scx_flags & SCX_TG_INITED))
>  		SCX_CALL_OP(SCX_KF_UNLOCKED, cgroup_exit, NULL, tg->css.cgroup);
>  	tg->scx_flags &= ~(SCX_TG_ONLINE | SCX_TG_INITED);
>
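
For the scx_tg_offline() hunk, my reading is that the guard relies on && short-circuiting: when scx_cgroup_enabled is false, SCX_HAS_OP() is never evaluated, so scx_root is never dereferenced. Below is a small user-space sketch of that ordering. The names (root_model, has_op_model(), tg_offline_model(), the *_MODEL constants) are hypothetical and the bit values are made up. It also assumes, as the commit message states, that scx_cgroup_enabled being true implies a non-NULL scx_root.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag and op values; the kernel's real definitions differ. */
#define TG_ONLINE_MODEL		(1u << 0)
#define TG_INITED_MODEL		(1u << 1)
#define OP_CGROUP_EXIT_MODEL	0

struct root_model {
	unsigned long has_op;	/* one bit per implemented op */
};

/* Models SCX_HAS_OP(): dereferences sch, so sch must not be NULL here. */
static bool has_op_model(const struct root_model *sch, int op)
{
	return (sch->has_op >> op) & 1UL;
}

/* Returns true if the modelled cgroup_exit op would have been called. */
static bool tg_offline_model(const struct root_model *sch, bool cgroup_enabled,
			     unsigned int *tg_flags)
{
	bool called = false;

	/* cgroup_enabled is tested first, so a NULL sch is never touched. */
	if (cgroup_enabled && has_op_model(sch, OP_CGROUP_EXIT_MODEL) &&
	    (*tg_flags & TG_INITED_MODEL))
		called = true;	/* stands in for SCX_CALL_OP(..., cgroup_exit, ...) */

	/* The flags are cleared regardless, as in the quoted hunk. */
	*tg_flags &= ~(TG_ONLINE_MODEL | TG_INITED_MODEL);
	return called;
}
```

Calling tg_offline_model(NULL, false, &flags) is safe in this sketch only because of the operand order. Swapping the first two operands would dereference the NULL pointer.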
Thread overview: 19+ messages
2025-04-23 23:44 [PATCHSET sched_ext/for-6.16] sched_ext: Introduce scx_sched Tejun Heo
2025-04-23 23:44 ` [PATCH 01/12] " Tejun Heo
2025-04-23 23:44 ` [PATCH 02/12] sched_ext: Avoid NULL scx_root deref through SCX_HAS_OP() Tejun Heo
2025-04-24 7:23 ` Chengming Zhou [this message]
2025-04-24 18:55 ` Tejun Heo
2025-04-23 23:44 ` [PATCH 03/12] sched_ext: Use dynamic allocation for scx_sched Tejun Heo
2025-04-25 10:14 ` Andrea Righi
2025-04-25 19:48 ` Tejun Heo
2025-04-23 23:44 ` [PATCH 04/12] sched_ext: Inline create_dsq() into scx_bpf_create_dsq() Tejun Heo
2025-04-23 23:44 ` [PATCH 05/12] sched_ext: Factor out scx_alloc_and_add_sched() Tejun Heo
2025-04-23 23:44 ` [PATCH 06/12] sched_ext: Move dsq_hash into scx_sched Tejun Heo
2025-04-23 23:44 ` [PATCH 07/12] sched_ext: Move global_dsqs " Tejun Heo
2025-04-23 23:44 ` [PATCH 08/12] sched_ext: Relocate scx_event_stats definition Tejun Heo
2025-04-23 23:44 ` [PATCH 09/12] sched_ext: Factor out scx_read_events() Tejun Heo
2025-04-23 23:44 ` [PATCH 10/12] sched_ext: Move event_stats_cpu into scx_sched Tejun Heo
2025-04-25 5:38 ` Changwoo Min
2025-04-23 23:44 ` [PATCH 11/12] sched_ext: Move disable machinery " Tejun Heo
2025-04-23 23:44 ` [PATCH 12/12] sched_ext: Clean up SCX_EXIT_NONE handling in scx_disable_workfn() Tejun Heo
-- strict thread matches above, loose matches on Subject: below --
2025-04-25 21:58 [PATCHSET v2 sched_ext/for-6.16] sched_ext: Introduce scx_sched Tejun Heo
2025-04-25 21:58 ` [PATCH 02/12] sched_ext: Avoid NULL scx_root deref through SCX_HAS_OP() Tejun Heo