* [PATCH v6 0/3] sched_ext: Harden scx_bpf_cpu_rq()
From: Christian Loehle @ 2025-09-02 11:11 UTC (permalink / raw)
To: tj, arighi, void
Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz, jake,
Christian Loehle
scx_bpf_cpu_rq() currently allows accessing struct rq fields without
holding the associated rq lock.
It is being used by scx_cosmos, scx_flash, scx_lavd, scx_layered, and
scx_tickless. Fortunately it is only ever used to fetch rq->curr.
So provide an alternative scx_bpf_remote_curr() that doesn't expose struct rq
and provide a hardened scx_bpf_cpu_rq_locked() by ensuring we hold the rq lock.
Add a deprecation warning to scx_bpf_cpu_rq() that mentions the two alternatives.
This also simplifies scx code from:
	rq = scx_bpf_cpu_rq(cpu);
	if (!rq)
		return;
	p = rq->curr;
	/* ... Do something with p */
into:
	p = scx_bpf_remote_curr(cpu);
	/* ... Do something with p */
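Since scx_bpf_remote_curr() is registered with KF_RET_NULL in patch 2/3, a
caller still has to check the returned pointer before using it. The following
is a minimal userspace sketch of that calling pattern, not BPF code: NR_CPUS,
model_curr, model_set_curr(), model_remote_curr() and remote_curr_pid() are
hypothetical stand-ins invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4	/* hypothetical CPU count for this model */

struct task_struct { int pid; };

/* Stand-in for per-CPU rq->curr; NULL models "no task visible". */
static struct task_struct *model_curr[NR_CPUS];

/* Test helper: install @p as the current task of @cpu in the model. */
static void model_set_curr(int cpu, struct task_struct *p)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	model_curr[cpu] = p;
}

/* Userspace stand-in for the kfunc: returns NULL for an invalid CPU. */
static struct task_struct *model_remote_curr(int cpu)
{
	if (cpu < 0 || cpu >= NR_CPUS)
		return NULL;
	return model_curr[cpu];
}

/* Hypothetical caller: returns the pid running on @cpu, or -1 if none. */
static int remote_curr_pid(int cpu)
{
	struct task_struct *p = model_remote_curr(cpu);

	if (!p)		/* the result may be NULL and must be checked */
		return -1;
	return p->pid;
}
```

In a real BPF scheduler the same NULL check would sit directly after the
scx_bpf_remote_curr() call, inside an RCU read-side section.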
Changes since:
v5:
https://lore.kernel.org/lkml/20250901132605.2282650-2-christian.loehle@arm.com/
- Actually expose the RCU pointer in scx_bpf_remote_curr() as such (Andrea)
v4:
https://lore.kernel.org/lkml/20250811212150.85759-1-christian.loehle@arm.com/
- Remove cpu argument from scx_bpf_cpu_rq_locked() as SCX has a unique
locked_rq_state anyway. (Tejun)
- Expose RCU pointer in scx_bpf_remote_curr() (Peter)
v3:
https://lore.kernel.org/lkml/20250805111036.130121-1-christian.loehle@arm.com/
- Don't change scx_bpf_cpu_rq(), so as not to break BPF schedulers without
a grace period. Just add the deprecation warning and do the hardening in
the new scx_bpf_cpu_rq_locked(). (Andrea, Tejun, Jake)
v2:
https://lore.kernel.org/lkml/20250804112743.711816-1-christian.loehle@arm.com/
- Open-code bpf_task_acquire() to avoid the forward declaration (Andrea)
- Rename scx_bpf_task_acquire_remote_curr() to make it more explicit it
behaves like bpf_task_acquire()
v1:
https://lore.kernel.org/lkml/20250801141741.355059-1-christian.loehle@arm.com/
- scx_bpf_cpu_rq() now errors when an unlocked rq is requested. (Andrea)
- scx_bpf_remote_curr() calls bpf_task_acquire(), so the BPF user needs to
release the task. (Andrea)
Christian Loehle (3):
sched_ext: Introduce scx_bpf_cpu_rq_locked()
sched_ext: Introduce scx_bpf_remote_curr()
sched_ext: deprecation warn for scx_bpf_cpu_rq()
kernel/sched/ext.c | 40 ++++++++++++++++++++++++
tools/sched_ext/include/scx/common.bpf.h | 2 ++
2 files changed, 42 insertions(+)
--
2.34.1
* [PATCH v6 1/3] sched_ext: Introduce scx_bpf_cpu_rq_locked()
From: Christian Loehle @ 2025-09-02 11:11 UTC (permalink / raw)
To: tj, arighi, void
Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz, jake,
Christian Loehle
Most fields of the struct rq returned by scx_bpf_cpu_rq() assume that
its rq lock is held, and they are meaningless without it.
Add a safer variant of scx_bpf_cpu_rq() that only returns a rq
if we hold the rq lock of that rq.
Also mark the new scx_bpf_cpu_rq_locked() as possibly returning NULL
(KF_RET_NULL).
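The contract described above can be sketched in plain userspace C. This is
only a model under stated assumptions: model_locked_rq stands in for the
state that scx_locked_rq() reads, model_errors replaces scx_kf_error() with
a counter, and every model_* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct rq { int cpu; };

/* Stand-in for SCX's record of the rq whose lock it currently holds. */
static struct rq *model_locked_rq;
static int model_errors;	/* counts lookups made without a lock held */

static void model_lock_rq(struct rq *rq)
{
	assert(!model_locked_rq);	/* the model tracks one locked rq at a time */
	model_locked_rq = rq;
}

static void model_unlock_rq(void)
{
	model_locked_rq = NULL;
}

/* Mirrors the patch's contract: the locked rq, or an error plus NULL. */
static struct rq *model_cpu_rq_locked(void)
{
	if (!model_locked_rq) {
		model_errors++;
		return NULL;
	}
	return model_locked_rq;
}
```

The lookup takes no cpu argument because, as noted in the v4 changelog, SCX
tracks a single locked rq.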
Signed-off-by: Christian Loehle <christian.loehle@arm.com>
---
kernel/sched/ext.c | 23 +++++++++++++++++++++++
tools/sched_ext/include/scx/common.bpf.h | 1 +
2 files changed, 24 insertions(+)
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 4ae32ef179dd..9fcc310d85d5 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -7430,6 +7430,28 @@ __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
 	return cpu_rq(cpu);
 }
+/**
+ * scx_bpf_cpu_rq_locked - Return the rq currently locked by SCX
+ *
+ * Returns the rq if a rq lock is currently held by SCX.
+ * Otherwise emits an error and returns NULL.
+ */
+__bpf_kfunc struct rq *scx_bpf_cpu_rq_locked(void)
+{
+	struct rq *rq;
+
+	preempt_disable();
+	rq = scx_locked_rq();
+	if (!rq) {
+		preempt_enable();
+		scx_kf_error("accessing rq without holding rq lock");
+		return NULL;
+	}
+	preempt_enable();
+
+	return rq;
+}
+
 /**
  * scx_bpf_task_cgroup - Return the sched cgroup of a task
  * @p: task of interest
@@ -7594,6 +7616,7 @@ BTF_ID_FLAGS(func, scx_bpf_put_cpumask, KF_RELEASE)
BTF_ID_FLAGS(func, scx_bpf_task_running, KF_RCU)
BTF_ID_FLAGS(func, scx_bpf_task_cpu, KF_RCU)
BTF_ID_FLAGS(func, scx_bpf_cpu_rq)
+BTF_ID_FLAGS(func, scx_bpf_cpu_rq_locked, KF_RET_NULL)
#ifdef CONFIG_CGROUP_SCHED
BTF_ID_FLAGS(func, scx_bpf_task_cgroup, KF_RCU | KF_ACQUIRE)
#endif
diff --git a/tools/sched_ext/include/scx/common.bpf.h b/tools/sched_ext/include/scx/common.bpf.h
index d4e21558e982..f5be06c93359 100644
--- a/tools/sched_ext/include/scx/common.bpf.h
+++ b/tools/sched_ext/include/scx/common.bpf.h
@@ -91,6 +91,7 @@ s32 scx_bpf_pick_any_cpu(const cpumask_t *cpus_allowed, u64 flags) __ksym;
bool scx_bpf_task_running(const struct task_struct *p) __ksym;
s32 scx_bpf_task_cpu(const struct task_struct *p) __ksym;
struct rq *scx_bpf_cpu_rq(s32 cpu) __ksym;
+struct rq *scx_bpf_cpu_rq_locked(void) __ksym;
struct cgroup *scx_bpf_task_cgroup(struct task_struct *p) __ksym __weak;
u64 scx_bpf_now(void) __ksym __weak;
void scx_bpf_events(struct scx_event_stats *events, size_t events__sz) __ksym __weak;
--
2.34.1
* [PATCH v6 2/3] sched_ext: Introduce scx_bpf_remote_curr()
From: Christian Loehle @ 2025-09-02 11:11 UTC (permalink / raw)
To: tj, arighi, void
Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz, jake,
Christian Loehle
Provide scx_bpf_remote_curr() as a way for scx schedulers to check the curr
task of a remote rq without assuming its lock is held.
Many scx schedulers make use of scx_bpf_cpu_rq() to check a remote curr
(e.g. to see if it should be preempted). This is problematic because
scx_bpf_cpu_rq() provides access to all fields of struct rq, most of
which aren't safe to use without holding the associated rq lock.
Signed-off-by: Christian Loehle <christian.loehle@arm.com>
---
kernel/sched/ext.c | 14 ++++++++++++++
tools/sched_ext/include/scx/common.bpf.h | 1 +
2 files changed, 15 insertions(+)
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 9fcc310d85d5..dc141144bfd6 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -7452,6 +7452,19 @@ __bpf_kfunc struct rq *scx_bpf_cpu_rq_locked(void)
 	return rq;
 }
+/**
+ * scx_bpf_remote_curr - Return remote CPU's curr task
+ * @cpu: CPU of interest
+ *
+ * Callers must hold RCU read lock (KF_RCU).
+ */
+__bpf_kfunc struct task_struct *scx_bpf_remote_curr(s32 cpu)
+{
+	if (!kf_cpu_valid(cpu, NULL))
+		return NULL;
+	return rcu_dereference(cpu_rq(cpu)->curr);
+}
+
 /**
  * scx_bpf_task_cgroup - Return the sched cgroup of a task
  * @p: task of interest
@@ -7617,6 +7630,7 @@ BTF_ID_FLAGS(func, scx_bpf_task_running, KF_RCU)
BTF_ID_FLAGS(func, scx_bpf_task_cpu, KF_RCU)
BTF_ID_FLAGS(func, scx_bpf_cpu_rq)
BTF_ID_FLAGS(func, scx_bpf_cpu_rq_locked, KF_RET_NULL)
+BTF_ID_FLAGS(func, scx_bpf_remote_curr, KF_RET_NULL | KF_RCU)
#ifdef CONFIG_CGROUP_SCHED
BTF_ID_FLAGS(func, scx_bpf_task_cgroup, KF_RCU | KF_ACQUIRE)
#endif
diff --git a/tools/sched_ext/include/scx/common.bpf.h b/tools/sched_ext/include/scx/common.bpf.h
index f5be06c93359..dd3d94256c10 100644
--- a/tools/sched_ext/include/scx/common.bpf.h
+++ b/tools/sched_ext/include/scx/common.bpf.h
@@ -92,6 +92,7 @@ bool scx_bpf_task_running(const struct task_struct *p) __ksym;
s32 scx_bpf_task_cpu(const struct task_struct *p) __ksym;
struct rq *scx_bpf_cpu_rq(s32 cpu) __ksym;
struct rq *scx_bpf_cpu_rq_locked(void) __ksym;
+struct task_struct *scx_bpf_remote_curr(s32 cpu) __ksym;
struct cgroup *scx_bpf_task_cgroup(struct task_struct *p) __ksym __weak;
u64 scx_bpf_now(void) __ksym __weak;
void scx_bpf_events(struct scx_event_stats *events, size_t events__sz) __ksym __weak;
--
2.34.1
* [PATCH v6 3/3] sched_ext: deprecation warn for scx_bpf_cpu_rq()
From: Christian Loehle @ 2025-09-02 11:11 UTC (permalink / raw)
To: tj, arighi, void
Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz, jake,
Christian Loehle
scx_bpf_cpu_rq() works on an unlocked rq which generally isn't safe.
For the common use-cases scx_bpf_cpu_rq_locked() and
scx_bpf_remote_curr() work, so add a deprecation warning
to scx_bpf_cpu_rq() so it can eventually be removed.
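The warn-once behaviour relied on here can be illustrated with a small
userspace analogue. It is a sketch only: model_warn_once() imitates what
pr_warn_once() does with a static flag, and model_deprecated_call() and
model_warnings are hypothetical names, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

static int model_warnings;	/* how many times the warning was printed */

/* Userspace analogue of pr_warn_once(): print only on the first call. */
static void model_warn_once(const char *func)
{
	static bool warned;

	assert(func != NULL);
	if (warned)
		return;
	warned = true;
	model_warnings++;
	fprintf(stderr, "%s() is deprecated\n", func);
}

/* Hypothetical deprecated entry point that still does its job. */
static int model_deprecated_call(int cpu)
{
	model_warn_once(__func__);
	return cpu;
}
```

The deprecated call keeps returning its normal result, so existing callers
continue to work while the log points them at the replacements.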
Signed-off-by: Christian Loehle <christian.loehle@arm.com>
---
kernel/sched/ext.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index dc141144bfd6..987c7dc38545 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -7427,6 +7427,9 @@ __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
 	if (!kf_cpu_valid(cpu, NULL))
 		return NULL;
+	pr_warn_once("%s() is deprecated; use scx_bpf_cpu_rq_locked() when holding rq lock "
+		     "or scx_bpf_remote_curr() to read remote curr safely.\n", __func__);
+
 	return cpu_rq(cpu);
 }
--
2.34.1
* Re: [PATCH v6 0/3] sched_ext: Harden scx_bpf_cpu_rq()
From: Andrea Righi @ 2025-09-02 11:58 UTC (permalink / raw)
To: Christian Loehle
Cc: tj, void, linux-kernel, sched-ext, changwoo, hodgesd, mingo,
peterz, jake
On Tue, Sep 02, 2025 at 12:11:40PM +0100, Christian Loehle wrote:
> scx_bpf_cpu_rq() currently allows accessing struct rq fields without
> holding the associated rq.
> It is being used by scx_cosmos, scx_flash, scx_lavd, scx_layered, and
> scx_tickless. Fortunately it is only ever used to fetch rq->curr.
> So provide an alternative scx_bpf_remote_curr() that doesn't expose struct rq
> and provide a hardened scx_bpf_cpu_rq_locked() by ensuring we hold the rq lock.
> Add a deprecation warning to scx_bpf_cpu_rq() that mentions the two alternatives.
>
> This also simplifies scx code from:
>
> rq = scx_bpf_cpu_rq(cpu);
> if (!rq)
> return;
> p = rq->curr
> /* ... Do something with p */
>
> into:
>
> p = scx_bpf_remote_curr(cpu);
> /* ... Do something with p */
This looks good to me.
We should probably add a __COMPAT_scx_bpf_remote_curr() macro, so that the
BPF schedulers can be updated to use this new kfunc without breaking the
compatibility with older kernels, but we can do this later, I'll send a
follow-up patch. For now:
Acked-by: Andrea Righi <arighi@nvidia.com>
Thanks,
-Andrea
* Re: [PATCH v6 0/3] sched_ext: Harden scx_bpf_cpu_rq()
From: Christian Loehle @ 2025-09-02 13:53 UTC (permalink / raw)
To: Andrea Righi
Cc: tj, void, linux-kernel, sched-ext, changwoo, hodgesd, mingo,
peterz, jake
On 9/2/25 12:58, Andrea Righi wrote:
> On Tue, Sep 02, 2025 at 12:11:40PM +0100, Christian Loehle wrote:
>> scx_bpf_cpu_rq() currently allows accessing struct rq fields without
>> holding the associated rq.
>> It is being used by scx_cosmos, scx_flash, scx_lavd, scx_layered, and
>> scx_tickless. Fortunately it is only ever used to fetch rq->curr.
>> So provide an alternative scx_bpf_remote_curr() that doesn't expose struct rq
>> and provide a hardened scx_bpf_cpu_rq_locked() by ensuring we hold the rq lock.
>> Add a deprecation warning to scx_bpf_cpu_rq() that mentions the two alternatives.
>>
>> This also simplifies scx code from:
>>
>> rq = scx_bpf_cpu_rq(cpu);
>> if (!rq)
>> return;
>> p = rq->curr
>> /* ... Do something with p */
>>
>> into:
>>
>> p = scx_bpf_remote_curr(cpu);
>> /* ... Do something with p */
>
> This looks good to me.
>
> We should probably add a __COMPAT_scx_bpf_remote_curr() macro, so that the
> BPF schedulers can be updated to use this new kfunc without breaking the
> compatibility with older kernels, but we can do this later, I'll send a
> follow-up patch. For now:
>
> Acked-by: Andrea Righi <arighi@nvidia.com>
Thanks!
I have the compat patch ready as well and will send it out in a bit.
* Re: [PATCH v6 0/3] sched_ext: Harden scx_bpf_cpu_rq()
From: Andrea Righi @ 2025-09-02 14:10 UTC (permalink / raw)
To: Christian Loehle
Cc: tj, void, linux-kernel, sched-ext, changwoo, hodgesd, mingo,
peterz, jake
On Tue, Sep 02, 2025 at 02:53:56PM +0100, Christian Loehle wrote:
> On 9/2/25 12:58, Andrea Righi wrote:
> > On Tue, Sep 02, 2025 at 12:11:40PM +0100, Christian Loehle wrote:
> >> scx_bpf_cpu_rq() currently allows accessing struct rq fields without
> >> holding the associated rq.
> >> It is being used by scx_cosmos, scx_flash, scx_lavd, scx_layered, and
> >> scx_tickless. Fortunately it is only ever used to fetch rq->curr.
> >> So provide an alternative scx_bpf_remote_curr() that doesn't expose struct rq
> >> and provide a hardened scx_bpf_cpu_rq_locked() by ensuring we hold the rq lock.
> >> Add a deprecation warning to scx_bpf_cpu_rq() that mentions the two alternatives.
> >>
> >> This also simplifies scx code from:
> >>
> >> rq = scx_bpf_cpu_rq(cpu);
> >> if (!rq)
> >> return;
> >> p = rq->curr
> >> /* ... Do something with p */
> >>
> >> into:
> >>
> >> p = scx_bpf_remote_curr(cpu);
> >> /* ... Do something with p */
> >
> > This looks good to me.
> >
> > We should probably add a __COMPAT_scx_bpf_remote_curr() macro, so that the
> > BPF schedulers can be updated to use this new kfunc without breaking the
> > compatibility with older kernels, but we can do this later, I'll send a
> > follow-up patch. For now:
> >
> > Acked-by: Andrea Righi <arighi@nvidia.com>
>
> Thanks!
> I have the compat patch ready as well and will send it out in a bit.
Awesome, I was thinking about something like the following (untested).
Feel free to include this in your patch.
Thanks,
-Andrea
tools/sched_ext/include/scx/compat.bpf.h | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/tools/sched_ext/include/scx/compat.bpf.h b/tools/sched_ext/include/scx/compat.bpf.h
index 36e0cd2fd4eda..67594ff99a461 100644
--- a/tools/sched_ext/include/scx/compat.bpf.h
+++ b/tools/sched_ext/include/scx/compat.bpf.h
@@ -230,6 +230,15 @@ static inline bool __COMPAT_is_enq_cpu_selected(u64 enq_flags)
scx_bpf_pick_any_cpu_node(cpus_allowed, node, flags) : \
scx_bpf_pick_any_cpu(cpus_allowed, flags))
+/*
+ * v6.18: Add a helper to retrieve the current task from a runqueue.
+ *
+ * Keep this macro available until v6.20 for compatibility.
+ */
+#define __COMPAT_scx_bpf_remote_curr(cpu) \
+ (bpf_ksym_exists(scx_bpf_remote_curr) ? \
+ scx_bpf_remote_curr(cpu) : scx_bpf_cpu_rq(cpu)->curr)
+
/*
* Define sched_ext_ops. This may be expanded to define multiple variants for
* backward compatibility. See compat.h::SCX_OPS_LOAD/ATTACH().
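One detail worth noting about the untested macro above: its fallback branch
dereferences the result of scx_bpf_cpu_rq(cpu) directly, while
scx_bpf_cpu_rq() returns NULL for an invalid CPU. A NULL-safe variant of the
same idea might look like the userspace sketch below. Everything here is a
model: model_has_remote_curr stands in for bpf_ksym_exists(), and the
model_* functions and objects are hypothetical, not the real kfuncs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct task_struct { int pid; };
struct rq { struct task_struct *curr; };

static struct task_struct model_task = { .pid = 7 };
static struct rq model_rq0 = { .curr = &model_task };

/* Models bpf_ksym_exists(): false means an older kernel without the kfunc. */
static bool model_has_remote_curr;

/* Stand-in for the old accessor: only CPU 0 is valid in this model. */
static struct rq *model_cpu_rq(int cpu)
{
	return cpu == 0 ? &model_rq0 : NULL;
}

/* Stand-in for the new kfunc; must only be called when it "exists". */
static struct task_struct *model_remote_curr(int cpu)
{
	assert(model_has_remote_curr);
	return cpu == 0 ? &model_task : NULL;
}

/* Prefer the new helper, else fall back without dereferencing a NULL rq. */
static struct task_struct *model_compat_remote_curr(int cpu)
{
	struct rq *rq;

	if (model_has_remote_curr)
		return model_remote_curr(cpu);

	rq = model_cpu_rq(cpu);
	return rq ? rq->curr : NULL;
}
```

Writing the fallback as a function rather than a single expression leaves
room for the NULL check and gives both paths the same return type.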