linux-kernel.vger.kernel.org archive mirror
* [PATCH v6 0/3] sched_ext: Harden scx_bpf_cpu_rq()
@ 2025-09-02 11:11 Christian Loehle
  2025-09-02 11:11 ` [PATCH v6 1/3] sched_ext: Introduce scx_bpf_cpu_rq_locked() Christian Loehle
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Christian Loehle @ 2025-09-02 11:11 UTC (permalink / raw)
  To: tj, arighi, void
  Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz, jake,
	Christian Loehle

scx_bpf_cpu_rq() currently allows accessing struct rq fields without
holding the associated rq lock.
It is being used by scx_cosmos, scx_flash, scx_lavd, scx_layered, and
scx_tickless. Fortunately it is only ever used to fetch rq->curr.
So provide an alternative scx_bpf_remote_curr() that doesn't expose struct rq
and provide a hardened scx_bpf_cpu_rq_locked() by ensuring we hold the rq lock.
Add a deprecation warning to scx_bpf_cpu_rq() that mentions the two alternatives.

This also simplifies scx code from:

rq = scx_bpf_cpu_rq(cpu);
if (!rq)
	return;
p = rq->curr;
/* ... Do something with p */

into:

p = scx_bpf_remote_curr(cpu);
/* ... Do something with p */
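
Since scx_bpf_remote_curr() is registered as KF_RET_NULL in patch 2, callers
still need a NULL check before using p. Below is a minimal user-space model of
that calling pattern, not kernel or BPF code. Every name in it (curr_stub,
remote_curr_stub(), should_preempt(), NR_CPUS_STUB, the two-field
task_struct) is a hypothetical stand-in chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct task_struct. */
struct task_struct {
	int pid;
	int prio;	/* lower value means higher priority */
};

#define NR_CPUS_STUB 4

/* Stub per-CPU "curr" table; the kernel reads cpu_rq(cpu)->curr instead. */
static struct task_struct *curr_stub[NR_CPUS_STUB];

/* Models scx_bpf_remote_curr(): NULL for an invalid CPU, else that CPU's curr. */
static struct task_struct *remote_curr_stub(int cpu)
{
	if (cpu < 0 || cpu >= NR_CPUS_STUB)
		return NULL;
	return curr_stub[cpu];
}

/* Caller pattern: check for NULL before touching p. */
static bool should_preempt(int cpu, int waking_prio)
{
	struct task_struct *p = remote_curr_stub(cpu);

	if (!p)
		return false;
	return waking_prio < p->prio;
}

/* Small self-check of the pattern above. */
static void demo_remote_curr(void)
{
	static struct task_struct t = { .pid = 1, .prio = 120 };

	curr_stub[1] = &t;
	assert(should_preempt(1, 100));		/* higher priority waker */
	assert(!should_preempt(1, 130));	/* lower priority waker */
	assert(!should_preempt(-1, 100));	/* invalid CPU yields NULL */
}
```

In an actual BPF scheduler the call would additionally have to satisfy the
kfunc's kernel-doc requirement that callers hold the RCU read lock.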

Changes since:
v5:
https://lore.kernel.org/lkml/20250901132605.2282650-2-christian.loehle@arm.com/
- Actually expose the RCU pointer in scx_bpf_remote_curr() as such (Andrea)
v4:
https://lore.kernel.org/lkml/20250811212150.85759-1-christian.loehle@arm.com/
- Remove cpu argument from scx_bpf_cpu_rq_locked() as SCX has a unique
locked_rq_state anyway. (Tejun)
- Expose RCU pointer in scx_bpf_remote_curr() (Peter)
v3:
https://lore.kernel.org/lkml/20250805111036.130121-1-christian.loehle@arm.com/
- Don't change scx_bpf_cpu_rq(), so as not to break BPF schedulers without
a grace period. Just add the deprecation warning and do the hardening in
the new scx_bpf_cpu_rq_locked(). (Andrea, Tejun, Jake)
v2:
https://lore.kernel.org/lkml/20250804112743.711816-1-christian.loehle@arm.com/
- Open-code bpf_task_acquire() to avoid the forward declaration (Andrea)
- Rename to scx_bpf_task_acquire_remote_curr() to make it more explicit that
it behaves like bpf_task_acquire()
v1:
https://lore.kernel.org/lkml/20250801141741.355059-1-christian.loehle@arm.com/
- scx_bpf_cpu_rq() now errors when a not locked rq is requested. (Andrea)
- scx_bpf_remote_curr() calls bpf_task_acquire() which BPF user needs to
release. (Andrea)

Christian Loehle (3):
  sched_ext: Introduce scx_bpf_cpu_rq_locked()
  sched_ext: Introduce scx_bpf_remote_curr()
  sched_ext: deprecation warn for scx_bpf_cpu_rq()

 kernel/sched/ext.c                       | 40 ++++++++++++++++++++++++
 tools/sched_ext/include/scx/common.bpf.h |  2 ++
 2 files changed, 42 insertions(+)

--
2.34.1



* [PATCH v6 1/3] sched_ext: Introduce scx_bpf_cpu_rq_locked()
  2025-09-02 11:11 [PATCH v6 0/3] sched_ext: Harden scx_bpf_cpu_rq() Christian Loehle
@ 2025-09-02 11:11 ` Christian Loehle
  2025-09-02 11:11 ` [PATCH v6 2/3] sched_ext: Introduce scx_bpf_remote_curr() Christian Loehle
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Christian Loehle @ 2025-09-02 11:11 UTC (permalink / raw)
  To: tj, arighi, void
  Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz, jake,
	Christian Loehle

Most fields of the struct rq returned by scx_bpf_cpu_rq() assume that
the rq lock is held, and their values are meaningless without it.
Add a safer variant, scx_bpf_cpu_rq_locked(), that only returns a rq
if we hold the rq lock of that rq.

Also mark the new scx_bpf_cpu_rq_locked() as KF_RET_NULL, since it may
return NULL.

Signed-off-by: Christian Loehle <christian.loehle@arm.com>
---
 kernel/sched/ext.c                       | 23 +++++++++++++++++++++++
 tools/sched_ext/include/scx/common.bpf.h |  1 +
 2 files changed, 24 insertions(+)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 4ae32ef179dd..9fcc310d85d5 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -7430,6 +7430,28 @@ __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
 	return cpu_rq(cpu);
 }
 
+/**
+ * scx_bpf_cpu_rq_locked - Return the rq currently locked by SCX
+ *
+ * Returns the rq if a rq lock is currently held by SCX.
+ * Otherwise emits an error and returns NULL.
+ */
+__bpf_kfunc struct rq *scx_bpf_cpu_rq_locked(void)
+{
+	struct rq *rq;
+
+	preempt_disable();
+	rq = scx_locked_rq();
+	if (!rq) {
+		preempt_enable();
+		scx_kf_error("accessing rq without holding rq lock");
+		return NULL;
+	}
+	preempt_enable();
+
+	return rq;
+}
+
 /**
  * scx_bpf_task_cgroup - Return the sched cgroup of a task
  * @p: task of interest
@@ -7594,6 +7616,7 @@ BTF_ID_FLAGS(func, scx_bpf_put_cpumask, KF_RELEASE)
 BTF_ID_FLAGS(func, scx_bpf_task_running, KF_RCU)
 BTF_ID_FLAGS(func, scx_bpf_task_cpu, KF_RCU)
 BTF_ID_FLAGS(func, scx_bpf_cpu_rq)
+BTF_ID_FLAGS(func, scx_bpf_cpu_rq_locked, KF_RET_NULL)
 #ifdef CONFIG_CGROUP_SCHED
 BTF_ID_FLAGS(func, scx_bpf_task_cgroup, KF_RCU | KF_ACQUIRE)
 #endif
diff --git a/tools/sched_ext/include/scx/common.bpf.h b/tools/sched_ext/include/scx/common.bpf.h
index d4e21558e982..f5be06c93359 100644
--- a/tools/sched_ext/include/scx/common.bpf.h
+++ b/tools/sched_ext/include/scx/common.bpf.h
@@ -91,6 +91,7 @@ s32 scx_bpf_pick_any_cpu(const cpumask_t *cpus_allowed, u64 flags) __ksym;
 bool scx_bpf_task_running(const struct task_struct *p) __ksym;
 s32 scx_bpf_task_cpu(const struct task_struct *p) __ksym;
 struct rq *scx_bpf_cpu_rq(s32 cpu) __ksym;
+struct rq *scx_bpf_cpu_rq_locked(void) __ksym;
 struct cgroup *scx_bpf_task_cgroup(struct task_struct *p) __ksym __weak;
 u64 scx_bpf_now(void) __ksym __weak;
 void scx_bpf_events(struct scx_event_stats *events, size_t events__sz) __ksym __weak;
-- 
2.34.1
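
The contract of scx_bpf_cpu_rq_locked() in the patch above can be
summarised as: hand out the rq only while SCX has it recorded as locked,
otherwise report an error and return NULL. What follows is a rough
user-space model of that contract, not the kernel implementation.
locked_rq_state here is a plain global rather than per-CPU state, and
update_locked_rq_stub(), cpu_rq_locked_stub() and nr_errors are hypothetical
names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct rq. */
struct rq {
	int cpu;
};

/* Models SCX's record of the rq it currently holds locked (NULL if none). */
static struct rq *locked_rq_state;

/* Counts the calls that would have reached scx_kf_error() in the kernel. */
static int nr_errors;

/* Models the bookkeeping done when the rq lock is taken or dropped. */
static void update_locked_rq_stub(struct rq *rq)
{
	locked_rq_state = rq;
}

/* Models scx_bpf_cpu_rq_locked(): only hand out a rq that is locked. */
static struct rq *cpu_rq_locked_stub(void)
{
	struct rq *rq = locked_rq_state;

	if (!rq)
		nr_errors++;	/* the kfunc reports an error at this point */
	return rq;
}

/* Small self-check of the contract. */
static void demo_locked(void)
{
	static struct rq rq0 = { .cpu = 0 };

	assert(cpu_rq_locked_stub() == NULL);	/* no lock held */
	assert(nr_errors == 1);

	update_locked_rq_stub(&rq0);		/* lock recorded */
	assert(cpu_rq_locked_stub() == &rq0);
	assert(nr_errors == 1);

	update_locked_rq_stub(NULL);		/* lock dropped again */
	assert(cpu_rq_locked_stub() == NULL);
	assert(nr_errors == 2);
}
```

Because the kfunc is flagged KF_RET_NULL, BPF callers are expected to check
the returned pointer before dereferencing it.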



* [PATCH v6 2/3] sched_ext: Introduce scx_bpf_remote_curr()
  2025-09-02 11:11 [PATCH v6 0/3] sched_ext: Harden scx_bpf_cpu_rq() Christian Loehle
  2025-09-02 11:11 ` [PATCH v6 1/3] sched_ext: Introduce scx_bpf_cpu_rq_locked() Christian Loehle
@ 2025-09-02 11:11 ` Christian Loehle
  2025-09-02 11:11 ` [PATCH v6 3/3] sched_ext: deprecation warn for scx_bpf_cpu_rq() Christian Loehle
  2025-09-02 11:58 ` [PATCH v6 0/3] sched_ext: Harden scx_bpf_cpu_rq() Andrea Righi
  3 siblings, 0 replies; 7+ messages in thread
From: Christian Loehle @ 2025-09-02 11:11 UTC (permalink / raw)
  To: tj, arighi, void
  Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz, jake,
	Christian Loehle

Provide scx_bpf_remote_curr() as a way for scx schedulers to check the curr
task of a remote rq without assuming its lock is held.

Many scx schedulers make use of scx_bpf_cpu_rq() to check a remote curr
(e.g. to see if it should be preempted). This is problematic because
scx_bpf_cpu_rq() provides access to all fields of struct rq, most of
which aren't safe to use without holding the associated rq lock.

Signed-off-by: Christian Loehle <christian.loehle@arm.com>
---
 kernel/sched/ext.c                       | 14 ++++++++++++++
 tools/sched_ext/include/scx/common.bpf.h |  1 +
 2 files changed, 15 insertions(+)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 9fcc310d85d5..dc141144bfd6 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -7452,6 +7452,19 @@ __bpf_kfunc struct rq *scx_bpf_cpu_rq_locked(void)
 	return rq;
 }
 
+/**
+ * scx_bpf_remote_curr - Return remote CPU's curr task
+ * @cpu: CPU of interest
+ *
+ * Callers must hold RCU read lock (KF_RCU).
+ */
+__bpf_kfunc struct task_struct *scx_bpf_remote_curr(s32 cpu)
+{
+	if (!kf_cpu_valid(cpu, NULL))
+		return NULL;
+	return rcu_dereference(cpu_rq(cpu)->curr);
+}
+
 /**
  * scx_bpf_task_cgroup - Return the sched cgroup of a task
  * @p: task of interest
@@ -7617,6 +7630,7 @@ BTF_ID_FLAGS(func, scx_bpf_task_running, KF_RCU)
 BTF_ID_FLAGS(func, scx_bpf_task_cpu, KF_RCU)
 BTF_ID_FLAGS(func, scx_bpf_cpu_rq)
 BTF_ID_FLAGS(func, scx_bpf_cpu_rq_locked, KF_RET_NULL)
+BTF_ID_FLAGS(func, scx_bpf_remote_curr, KF_RET_NULL | KF_RCU)
 #ifdef CONFIG_CGROUP_SCHED
 BTF_ID_FLAGS(func, scx_bpf_task_cgroup, KF_RCU | KF_ACQUIRE)
 #endif
diff --git a/tools/sched_ext/include/scx/common.bpf.h b/tools/sched_ext/include/scx/common.bpf.h
index f5be06c93359..dd3d94256c10 100644
--- a/tools/sched_ext/include/scx/common.bpf.h
+++ b/tools/sched_ext/include/scx/common.bpf.h
@@ -92,6 +92,7 @@ bool scx_bpf_task_running(const struct task_struct *p) __ksym;
 s32 scx_bpf_task_cpu(const struct task_struct *p) __ksym;
 struct rq *scx_bpf_cpu_rq(s32 cpu) __ksym;
 struct rq *scx_bpf_cpu_rq_locked(void) __ksym;
+struct task_struct *scx_bpf_remote_curr(s32 cpu) __ksym;
 struct cgroup *scx_bpf_task_cgroup(struct task_struct *p) __ksym __weak;
 u64 scx_bpf_now(void) __ksym __weak;
 void scx_bpf_events(struct scx_event_stats *events, size_t events__sz) __ksym __weak;
-- 
2.34.1



* [PATCH v6 3/3] sched_ext: deprecation warn for scx_bpf_cpu_rq()
  2025-09-02 11:11 [PATCH v6 0/3] sched_ext: Harden scx_bpf_cpu_rq() Christian Loehle
  2025-09-02 11:11 ` [PATCH v6 1/3] sched_ext: Introduce scx_bpf_cpu_rq_locked() Christian Loehle
  2025-09-02 11:11 ` [PATCH v6 2/3] sched_ext: Introduce scx_bpf_remote_curr() Christian Loehle
@ 2025-09-02 11:11 ` Christian Loehle
  2025-09-02 11:58 ` [PATCH v6 0/3] sched_ext: Harden scx_bpf_cpu_rq() Andrea Righi
  3 siblings, 0 replies; 7+ messages in thread
From: Christian Loehle @ 2025-09-02 11:11 UTC (permalink / raw)
  To: tj, arighi, void
  Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz, jake,
	Christian Loehle

scx_bpf_cpu_rq() works on an unlocked rq, which generally isn't safe.
scx_bpf_cpu_rq_locked() and scx_bpf_remote_curr() cover the common
use cases, so add a deprecation warning to scx_bpf_cpu_rq() so that it
can eventually be removed.

Signed-off-by: Christian Loehle <christian.loehle@arm.com>
---
 kernel/sched/ext.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index dc141144bfd6..987c7dc38545 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -7427,6 +7427,9 @@ __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
 	if (!kf_cpu_valid(cpu, NULL))
 		return NULL;
 
+	pr_warn_once("%s() is deprecated; use scx_bpf_cpu_rq_locked() when holding rq lock "
+		     "or scx_bpf_remote_curr() to read remote curr safely.\n", __func__);
+
 	return cpu_rq(cpu);
 }
 
-- 
2.34.1



* Re: [PATCH v6 0/3] sched_ext: Harden scx_bpf_cpu_rq()
  2025-09-02 11:11 [PATCH v6 0/3] sched_ext: Harden scx_bpf_cpu_rq() Christian Loehle
                   ` (2 preceding siblings ...)
  2025-09-02 11:11 ` [PATCH v6 3/3] sched_ext: deprecation warn for scx_bpf_cpu_rq() Christian Loehle
@ 2025-09-02 11:58 ` Andrea Righi
  2025-09-02 13:53   ` Christian Loehle
  3 siblings, 1 reply; 7+ messages in thread
From: Andrea Righi @ 2025-09-02 11:58 UTC (permalink / raw)
  To: Christian Loehle
  Cc: tj, void, linux-kernel, sched-ext, changwoo, hodgesd, mingo,
	peterz, jake

On Tue, Sep 02, 2025 at 12:11:40PM +0100, Christian Loehle wrote:
> scx_bpf_cpu_rq() currently allows accessing struct rq fields without
> holding the associated rq.
> It is being used by scx_cosmos, scx_flash, scx_lavd, scx_layered, and
> scx_tickless. Fortunately it is only ever used to fetch rq->curr.
> So provide an alternative scx_bpf_remote_curr() that doesn't expose struct rq
> and provide a hardened scx_bpf_cpu_rq_locked() by ensuring we hold the rq lock.
> Add a deprecation warning to scx_bpf_cpu_rq() that mentions the two alternatives.
> 
> This also simplifies scx code from:
> 
> rq = scx_bpf_cpu_rq(cpu);
> if (!rq)
> 	return;
> p = rq->curr
> /* ... Do something with p */
> 
> into:
> 
> p = scx_bpf_remote_curr(cpu);
> /* ... Do something with p */

This looks good to me.

We should probably add a __COMPAT_scx_bpf_remote_curr() macro, so that the
BPF schedulers can be updated to use this new kfunc without breaking the
compatibility with older kernels, but we can do this later, I'll send a
follow-up patch. For now:

Acked-by: Andrea Righi <arighi@nvidia.com>

Thanks,
-Andrea


* Re: [PATCH v6 0/3] sched_ext: Harden scx_bpf_cpu_rq()
  2025-09-02 11:58 ` [PATCH v6 0/3] sched_ext: Harden scx_bpf_cpu_rq() Andrea Righi
@ 2025-09-02 13:53   ` Christian Loehle
  2025-09-02 14:10     ` Andrea Righi
  0 siblings, 1 reply; 7+ messages in thread
From: Christian Loehle @ 2025-09-02 13:53 UTC (permalink / raw)
  To: Andrea Righi
  Cc: tj, void, linux-kernel, sched-ext, changwoo, hodgesd, mingo,
	peterz, jake

On 9/2/25 12:58, Andrea Righi wrote:
> On Tue, Sep 02, 2025 at 12:11:40PM +0100, Christian Loehle wrote:
>> scx_bpf_cpu_rq() currently allows accessing struct rq fields without
>> holding the associated rq.
>> It is being used by scx_cosmos, scx_flash, scx_lavd, scx_layered, and
>> scx_tickless. Fortunately it is only ever used to fetch rq->curr.
>> So provide an alternative scx_bpf_remote_curr() that doesn't expose struct rq
>> and provide a hardened scx_bpf_cpu_rq_locked() by ensuring we hold the rq lock.
>> Add a deprecation warning to scx_bpf_cpu_rq() that mentions the two alternatives.
>>
>> This also simplifies scx code from:
>>
>> rq = scx_bpf_cpu_rq(cpu);
>> if (!rq)
>> 	return;
>> p = rq->curr
>> /* ... Do something with p */
>>
>> into:
>>
>> p = scx_bpf_remote_curr(cpu);
>> /* ... Do something with p */
> 
> This looks good to me.
> 
> We should probably add a __COMPAT_scx_bpf_remote_curr() macro, so that the
> BPF schedulers can be updated to use this new kfunc without breaking the
> compatibility with older kernels, but we can do this later, I'll send a
> follow-up patch. For now:
> 
> Acked-by: Andrea Righi <arighi@nvidia.com>

Thanks!
I'd have the compat patch ready as well and would send it out in a bit.



* Re: [PATCH v6 0/3] sched_ext: Harden scx_bpf_cpu_rq()
  2025-09-02 13:53   ` Christian Loehle
@ 2025-09-02 14:10     ` Andrea Righi
  0 siblings, 0 replies; 7+ messages in thread
From: Andrea Righi @ 2025-09-02 14:10 UTC (permalink / raw)
  To: Christian Loehle
  Cc: tj, void, linux-kernel, sched-ext, changwoo, hodgesd, mingo,
	peterz, jake

On Tue, Sep 02, 2025 at 02:53:56PM +0100, Christian Loehle wrote:
> On 9/2/25 12:58, Andrea Righi wrote:
> > On Tue, Sep 02, 2025 at 12:11:40PM +0100, Christian Loehle wrote:
> >> scx_bpf_cpu_rq() currently allows accessing struct rq fields without
> >> holding the associated rq.
> >> It is being used by scx_cosmos, scx_flash, scx_lavd, scx_layered, and
> >> scx_tickless. Fortunately it is only ever used to fetch rq->curr.
> >> So provide an alternative scx_bpf_remote_curr() that doesn't expose struct rq
> >> and provide a hardened scx_bpf_cpu_rq_locked() by ensuring we hold the rq lock.
> >> Add a deprecation warning to scx_bpf_cpu_rq() that mentions the two alternatives.
> >>
> >> This also simplifies scx code from:
> >>
> >> rq = scx_bpf_cpu_rq(cpu);
> >> if (!rq)
> >> 	return;
> >> p = rq->curr
> >> /* ... Do something with p */
> >>
> >> into:
> >>
> >> p = scx_bpf_remote_curr(cpu);
> >> /* ... Do something with p */
> > 
> > This looks good to me.
> > 
> > We should probably add a __COMPAT_scx_bpf_remote_curr() macro, so that the
> > BPF schedulers can be updated to use this new kfunc without breaking the
> > compatibility with older kernels, but we can do this later, I'll send a
> > follow-up patch. For now:
> > 
> > Acked-by: Andrea Righi <arighi@nvidia.com>
> 
> Thanks!
> I'd have the compat patch ready as well and would send it out in a bit.

Awesome, I was thinking about something like the following (untested).
Feel free to include this in your patch.

Thanks,
-Andrea

 tools/sched_ext/include/scx/compat.bpf.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tools/sched_ext/include/scx/compat.bpf.h b/tools/sched_ext/include/scx/compat.bpf.h
index 36e0cd2fd4eda..67594ff99a461 100644
--- a/tools/sched_ext/include/scx/compat.bpf.h
+++ b/tools/sched_ext/include/scx/compat.bpf.h
@@ -230,6 +230,15 @@ static inline bool __COMPAT_is_enq_cpu_selected(u64 enq_flags)
 	 scx_bpf_pick_any_cpu_node(cpus_allowed, node, flags) :			\
 	 scx_bpf_pick_any_cpu(cpus_allowed, flags))
 
+/*
+ * v6.18: Add a helper to retrieve the current task from a runqueue.
+ *
+ * Keep this macro available until v6.20 for compatibility.
+ */
+#define __COMPAT_scx_bpf_remote_curr(cpu)					\
+	(bpf_ksym_exists(scx_bpf_remote_curr) ?					\
+	 scx_bpf_remote_curr(cpu) : scx_bpf_cpu_rq(cpu)->curr)
+
 /*
  * Define sched_ext_ops. This may be expanded to define multiple variants for
  * backward compatibility. See compat.h::SCX_OPS_LOAD/ATTACH().
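
One thing to watch in the untested macro above: on kernels without
scx_bpf_remote_curr(), the fallback dereferences the result of
scx_bpf_cpu_rq() directly, and that kfunc returns NULL for an invalid CPU.
A NULL-safe shape for the fallback might look like the user-space model
below. It is only a sketch: have_remote_curr models
bpf_ksym_exists(scx_bpf_remote_curr), and cpu_rq_stub(), remote_curr_stub()
and compat_remote_curr() are hypothetical stand-ins, not the names used in
compat.bpf.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct task_struct {
	int pid;
};

struct rq {
	struct task_struct *curr;
};

#define NR_CPUS_STUB 2

static struct rq rq_stub[NR_CPUS_STUB];

/* Models bpf_ksym_exists(scx_bpf_remote_curr). */
static bool have_remote_curr;

/* Models scx_bpf_cpu_rq(): NULL for an invalid CPU. */
static struct rq *cpu_rq_stub(int cpu)
{
	if (cpu < 0 || cpu >= NR_CPUS_STUB)
		return NULL;
	return &rq_stub[cpu];
}

/* Models scx_bpf_remote_curr(): NULL for an invalid CPU. */
static struct task_struct *remote_curr_stub(int cpu)
{
	if (cpu < 0 || cpu >= NR_CPUS_STUB)
		return NULL;
	return rq_stub[cpu].curr;
}

/* NULL-safe compat wrapper: prefer the new call, else check rq first. */
static struct task_struct *compat_remote_curr(int cpu)
{
	struct rq *rq;

	if (have_remote_curr)
		return remote_curr_stub(cpu);

	rq = cpu_rq_stub(cpu);
	return rq ? rq->curr : NULL;
}

/* Both paths agree, including for an invalid CPU. */
static void demo_compat(void)
{
	static struct task_struct t = { .pid = 42 };

	rq_stub[1].curr = &t;

	have_remote_curr = false;		/* older kernel: fallback path */
	assert(compat_remote_curr(1) == &t);
	assert(compat_remote_curr(7) == NULL);	/* no NULL dereference */

	have_remote_curr = true;		/* newer kernel: direct path */
	assert(compat_remote_curr(1) == &t);
	assert(compat_remote_curr(7) == NULL);
}
```

Writing the wrapper as a static inline function rather than a macro leaves
room for the intermediate rq check without evaluating cpu twice.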
