From: Tejun Heo <tj@kernel.org>
To: David Vernet <void@manifault.com>,
Andrea Righi <arighi@nvidia.com>,
Changwoo Min <changwoo@igalia.com>
Cc: sched-ext@lists.linux.dev, linux-kernel@vger.kernel.org,
Emil Tsalapatis <emil@etsalapatis.com>,
Chris Mason <clm@meta.com>, Ryan Newton <newton@meta.com>,
Tejun Heo <tj@kernel.org>
Subject: [PATCH v2 13/13] sched_ext: Refuse cross-task select_cpu_from_kfunc calls
Date: Fri, 24 Apr 2026 14:19:42 -1000 [thread overview]
Message-ID: <20260425001942.3987351-1-tj@kernel.org> (raw)
In-Reply-To: <20260424204418.3809733-14-tj@kernel.org>
select_cpu_from_kfunc() skipped pi_lock for @p when called from
ops.select_cpu() or another rq-locked SCX op, assuming the held lock
protects @p. scx_bpf_select_cpu_dfl() / __scx_bpf_select_cpu_and() accept an
arbitrary KF_RCU task_struct, so a caller in e.g. ops.select_cpu(p1) or
ops.enqueue(p1) can pass some other p2 - the held pi_lock / rq lock is p1's,
not p2's - and reading p2->cpus_ptr / nr_cpus_allowed races with
set_cpus_allowed_ptr() and migrate_disable_switch() on another CPU.
Abort the scheduler on cross-task calls in both branches: for
ops.select_cpu() use scx_kf_arg_task_ok() to verify @p is the wake-up
task recorded in current->scx.kf_tasks[] by SCX_CALL_OP_TASK_RET();
for other rq-locked SCX ops compare task_rq(p) against scx_locked_rq().
v2: Per Andrea Righi: switch the in_select_cpu cross-task check from
direct_dispatch_task comparison to scx_kf_arg_task_ok(). The former
spuriously rejects when ops.select_cpu() calls scx_bpf_dsq_insert()
first (mark_direct_dispatch() stamps direct_dispatch_task =
ERR_PTR(-ESRCH)), then calls scx_bpf_select_cpu_*() on the same task.
Fixes: 0022b328504d ("sched_ext: Decouple kfunc unlocked-context check from kf_mask")
Reported-by: Chris Mason <clm@meta.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Andrea Righi <arighi@nvidia.com>
---
kernel/sched/ext_idle.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/ext_idle.c b/kernel/sched/ext_idle.c
index c43d62d90e40..7468560a6d80 100644
--- a/kernel/sched/ext_idle.c
+++ b/kernel/sched/ext_idle.c
@@ -927,14 +927,24 @@ static s32 select_cpu_from_kfunc(struct scx_sched *sch, struct task_struct *p,
* Accessing p->cpus_ptr / p->nr_cpus_allowed needs either @p's rq
* lock or @p's pi_lock. Three cases:
*
- * - inside ops.select_cpu(): try_to_wake_up() holds @p's pi_lock.
+ * - inside ops.select_cpu(): try_to_wake_up() holds the wake-up
+ * task's pi_lock; the wake-up task is recorded in kf_tasks[0]
+ * by SCX_CALL_OP_TASK_RET().
* - other rq-locked SCX op: scx_locked_rq() points at the held rq.
* - truly unlocked (UNLOCKED ops, SYSCALL, non-SCX struct_ops):
* nothing held, take pi_lock ourselves.
+ *
+ * In the first two cases, BPF schedulers may pass an arbitrary task
+ * that the held lock doesn't cover. Refuse those.
*/
if (this_rq()->scx.in_select_cpu) {
+ if (!scx_kf_arg_task_ok(sch, p))
+ return -EINVAL;
lockdep_assert_held(&p->pi_lock);
- } else if (!scx_locked_rq()) {
+ } else if (scx_locked_rq()) {
+ if (task_rq(p) != scx_locked_rq())
+ goto cross_task;
+ } else {
raw_spin_lock_irqsave(&p->pi_lock, irq_flags);
we_locked = true;
}
@@ -960,6 +970,11 @@ static s32 select_cpu_from_kfunc(struct scx_sched *sch, struct task_struct *p,
raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
return cpu;
+
+cross_task:
+ scx_error(sch, "select_cpu kfunc called cross-task on %s[%d]",
+ p->comm, p->pid);
+ return -EINVAL;
}
/**
--
2.53.0
next prev parent reply other threads:[~2026-04-25 0:19 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-24 20:44 [PATCHSET sched_ext/for-7.1-fixes] sched_ext: Assorted fixes Tejun Heo
2026-04-24 20:44 ` [PATCH 01/13] sched_ext: Unregister sub_kset on scheduler disable Tejun Heo
2026-04-24 20:44 ` [PATCH 02/13] sched_ext: Guard scx_dsq_move() against NULL kit->dsq after failed iter_new Tejun Heo
2026-04-24 20:44 ` [PATCH 03/13] sched_ext: Skip tasks with stale task_rq in bypass_lb_cpu() Tejun Heo
2026-04-24 20:44 ` [PATCH 04/13] sched_ext: Don't disable tasks in scx_sub_enable_workfn() abort path Tejun Heo
2026-04-24 20:44 ` [PATCH 05/13] sched_ext: Read scx_root under scx_cgroup_ops_rwsem in cgroup setters Tejun Heo
2026-04-24 20:44 ` [PATCH 06/13] sched_ext: Resolve caller's scheduler in scx_bpf_destroy_dsq() / scx_bpf_dsq_nr_queued() Tejun Heo
2026-04-24 20:44 ` [PATCH 07/13] sched_ext: Use dsq->first_task instead of list_empty() in dispatch_enqueue() FIFO-tail Tejun Heo
2026-04-24 20:44 ` [PATCH 08/13] sched_ext: Save and restore scx_locked_rq across SCX_CALL_OP Tejun Heo
2026-04-24 20:44 ` [PATCH 09/13] sched_ext: Pass held rq to SCX_CALL_OP() for dump_cpu/dump_task Tejun Heo
2026-04-24 20:44 ` [PATCH 10/13] sched_ext: Pass held rq to SCX_CALL_OP() for core_sched_before Tejun Heo
2026-04-24 20:44 ` [PATCH 11/13] sched_ext: Make bypass LB cpumasks per-scheduler Tejun Heo
2026-04-24 20:44 ` [PATCH 12/13] sched_ext: Align cgroup #ifdef guards with SUB_SCHED vs GROUP_SCHED Tejun Heo
2026-04-24 20:44 ` [PATCH 13/13] sched_ext: Refuse cross-task select_cpu_from_kfunc calls Tejun Heo
2026-04-24 21:46 ` Andrea Righi
2026-04-25 0:19 ` Tejun Heo [this message]
2026-04-25 6:50 ` [PATCH v2 " Andrea Righi
2026-04-24 21:08 ` [PATCH 14/13] sched_ext: Reject NULL-sch callers in scx_bpf_task_set_slice/dsq_vtime Tejun Heo
2026-04-24 21:08 ` [PATCH 15/13] sched_ext: Release cpus_read_lock on scx_link_sched() failure in root enable Tejun Heo
2026-04-24 22:00 ` Andrea Righi
2026-04-25 0:19 ` [PATCH v2 " Tejun Heo
2026-04-25 6:51 ` Andrea Righi
2026-04-24 22:10 ` [PATCHSET sched_ext/for-7.1-fixes] sched_ext: Assorted fixes Andrea Righi
2026-04-25 0:39 ` Tejun Heo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260425001942.3987351-1-tj@kernel.org \
--to=tj@kernel.org \
--cc=arighi@nvidia.com \
--cc=changwoo@igalia.com \
--cc=clm@meta.com \
--cc=emil@etsalapatis.com \
--cc=linux-kernel@vger.kernel.org \
--cc=newton@meta.com \
--cc=sched-ext@lists.linux.dev \
--cc=void@manifault.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.