All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/3] sched_ext: Harden scx_bpf_cpu_rq()
@ 2025-08-11 21:21 Christian Loehle
  2025-08-11 21:21 ` [PATCH v4 1/3] sched_ext: Introduce scx_bpf_cpu_rq_locked() Christian Loehle
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Christian Loehle @ 2025-08-11 21:21 UTC (permalink / raw)
  To: tj, arighi, void
  Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz, jake,
	Christian Loehle

scx_bpf_cpu_rq() currently allows accessing struct rq fields without
holding the associated rq.
It is being used by scx_cosmos, scx_flash, scx_lavd, scx_layered, and
scx_tickless. Fortunately it is only ever used to fetch rq->curr.
So provide an alternative scx_bpf_task_acquire_remote_curr() that
doesn't expose struct rq and provide a hardened scx_bpf_cpu_rq_locked()
by ensuring we hold the rq lock.
Add a deprecation warning to scx_bpf_cpu_rq_locked() that mentions the
two alternatives.

This also simplifies scx code from:

rq = scx_bpf_cpu_rq(cpu);
if (!rq)
	return;
p = rq->curr
if (!p)
	return;
/* ... Do something with p */

into:

p = scx_bpf_task_acquire_remote_curr(cpu);
if (!p)
	return;
/* ... Do something with p */
bpf_task_release(p);


v3:
https://lore.kernel.org/lkml/20250805111036.130121-1-christian.loehle@arm.com/
Don't change scx_bpf_cpu_rq() do not break BPF schedulers without the
grace period. Just add the deprecation warning and do the hardening in
the new scx_bpf_cpu_rq_locked(). (Andrea, Tejun, Jake)
v2:
https://lore.kernel.org/lkml/20250804112743.711816-1-christian.loehle@arm.com/
- Open-code bpf_task_acquire() to avoid the forward declaration (Andrea)
- Rename scx_bpf_task_acquire_remote_curr() to make it more explicit it
behaves like bpf_task_acquire()
- Dis
v1:
https://lore.kernel.org/lkml/20250801141741.355059-1-christian.loehle@arm.com/
- scx_bpf_cpu_rq() now errors when a not locked rq is requested. (Andrea)
- scx_bpf_remote_curr() calls bpf_task_acquire() which BPF user needs to
release. (Andrea)
Christian Loehle (3):
  sched_ext: Introduce scx_bpf_cpu_rq_locked()
  sched_ext: Provide scx_bpf_task_acquire_remote_curr()
  sched_ext: deprecation warn for scx_bpf_cpu_rq()

 kernel/sched/ext.c                       | 49 ++++++++++++++++++++++++
 tools/sched_ext/include/scx/common.bpf.h |  2 +
 2 files changed, 51 insertions(+)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v4 1/3] sched_ext: Introduce scx_bpf_cpu_rq_locked()
  2025-08-11 21:21 [PATCH v4 0/3] sched_ext: Harden scx_bpf_cpu_rq() Christian Loehle
@ 2025-08-11 21:21 ` Christian Loehle
  2025-08-11 23:38   ` Tejun Heo
  2025-08-11 21:21 ` [PATCH v4 2/3] sched_ext: Provide scx_bpf_task_acquire_remote_curr() Christian Loehle
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Christian Loehle @ 2025-08-11 21:21 UTC (permalink / raw)
  To: tj, arighi, void
  Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz, jake,
	Christian Loehle

Most fields in scx_bpf_cpu_rq() assume that its rq_lock is held.
Furthermore they become meaningless without rq lock, too.
Make a safer version of scx_bpf_cpu_rq() that only returns a rq
if we hold rq lock of that rq.

Also mark the new scx_bpf_cpu_rq_locked() as returning NULL.

Signed-off-by: Christian Loehle <christian.loehle@arm.com>
---
 kernel/sched/ext.c                       | 22 ++++++++++++++++++++++
 tools/sched_ext/include/scx/common.bpf.h |  1 +
 2 files changed, 23 insertions(+)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 7dedc9a16281..14706c36ca83 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -7426,6 +7426,27 @@ __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
 	return cpu_rq(cpu);
 }
 
+/**
+ * scx_bpf_cpu_rq_locked - Fetch the locked rq of a CPU
+ * @cpu: CPU of the rq
+ */
+__bpf_kfunc struct rq *scx_bpf_cpu_rq_locked(s32 cpu)
+{
+	struct rq *rq;
+
+	if (!kf_cpu_valid(cpu, NULL))
+		return NULL;
+
+	preempt_disable();
+	rq = cpu_rq(cpu);
+	if (rq != scx_locked_rq()) {
+		scx_kf_error("Accessing not locked rq %d", cpu);
+		rq = NULL;
+	}
+	preempt_enable();
+	return rq;
+}
+
 /**
  * scx_bpf_task_cgroup - Return the sched cgroup of a task
  * @p: task of interest
@@ -7590,6 +7611,7 @@ BTF_ID_FLAGS(func, scx_bpf_put_cpumask, KF_RELEASE)
 BTF_ID_FLAGS(func, scx_bpf_task_running, KF_RCU)
 BTF_ID_FLAGS(func, scx_bpf_task_cpu, KF_RCU)
 BTF_ID_FLAGS(func, scx_bpf_cpu_rq)
+BTF_ID_FLAGS(func, scx_bpf_cpu_rq_locked, KF_RET_NULL)
 #ifdef CONFIG_CGROUP_SCHED
 BTF_ID_FLAGS(func, scx_bpf_task_cgroup, KF_RCU | KF_ACQUIRE)
 #endif
diff --git a/tools/sched_ext/include/scx/common.bpf.h b/tools/sched_ext/include/scx/common.bpf.h
index d4e21558e982..7451491347ed 100644
--- a/tools/sched_ext/include/scx/common.bpf.h
+++ b/tools/sched_ext/include/scx/common.bpf.h
@@ -91,6 +91,7 @@ s32 scx_bpf_pick_any_cpu(const cpumask_t *cpus_allowed, u64 flags) __ksym;
 bool scx_bpf_task_running(const struct task_struct *p) __ksym;
 s32 scx_bpf_task_cpu(const struct task_struct *p) __ksym;
 struct rq *scx_bpf_cpu_rq(s32 cpu) __ksym;
+struct rq *scx_bpf_cpu_rq_locked(s32 cpu) __ksym;
 struct cgroup *scx_bpf_task_cgroup(struct task_struct *p) __ksym __weak;
 u64 scx_bpf_now(void) __ksym __weak;
 void scx_bpf_events(struct scx_event_stats *events, size_t events__sz) __ksym __weak;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v4 2/3] sched_ext: Provide scx_bpf_task_acquire_remote_curr()
  2025-08-11 21:21 [PATCH v4 0/3] sched_ext: Harden scx_bpf_cpu_rq() Christian Loehle
  2025-08-11 21:21 ` [PATCH v4 1/3] sched_ext: Introduce scx_bpf_cpu_rq_locked() Christian Loehle
@ 2025-08-11 21:21 ` Christian Loehle
  2025-08-17  4:07   ` kernel test robot
  2025-08-11 21:21 ` [PATCH v4 3/3] sched_ext: deprecation warn for scx_bpf_cpu_rq() Christian Loehle
  2025-08-12  8:00 ` [PATCH v4 0/3] sched_ext: Harden scx_bpf_cpu_rq() Peter Zijlstra
  3 siblings, 1 reply; 11+ messages in thread
From: Christian Loehle @ 2025-08-11 21:21 UTC (permalink / raw)
  To: tj, arighi, void
  Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz, jake,
	Christian Loehle

Provide scx_bpf_task_acquire_remote_curr() as a way for scx schedulers
to check the curr task of a remote rq without assuming its lock is
held.

Many scx schedulers make use of scx_bpf_cpu_rq() to check a remote curr
(e.g. to see if it should be preempted). This is problematic because
scx_bpf_cpu_rq() provides access to all fields of struct rq, most of
which aren't safe to use without holding the associated rq lock.

Signed-off-by: Christian Loehle <christian.loehle@arm.com>
---
 kernel/sched/ext.c                       | 24 ++++++++++++++++++++++++
 tools/sched_ext/include/scx/common.bpf.h |  1 +
 2 files changed, 25 insertions(+)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 14706c36ca83..ded4ace36090 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -7447,6 +7447,29 @@ __bpf_kfunc struct rq *scx_bpf_cpu_rq_locked(s32 cpu)
 	return rq;
 }
 
+/**
+ * scx_bpf_task_acquire_remote_curr - Fetch the curr task of a rq without
+ * acquiring its rq lock
+ * @cpu: CPU of the rq
+ *
+ * Increments the refcount of the task_struct which needs to be released using
+ * bpf_task_release().
+ */
+__bpf_kfunc struct task_struct *scx_bpf_task_acquire_remote_curr(s32 cpu)
+{
+	struct task_struct *p;
+
+	if (!kf_cpu_valid(cpu, NULL))
+		return NULL;
+
+	rcu_read_lock();
+	p = cpu_rq(cpu)->curr;
+	if (p)
+		p = refcount_inc_not_zero(&p->rcu_users) ? p : NULL;
+	rcu_read_unlock();
+	return p;
+}
+
 /**
  * scx_bpf_task_cgroup - Return the sched cgroup of a task
  * @p: task of interest
@@ -7612,6 +7635,7 @@ BTF_ID_FLAGS(func, scx_bpf_task_running, KF_RCU)
 BTF_ID_FLAGS(func, scx_bpf_task_cpu, KF_RCU)
 BTF_ID_FLAGS(func, scx_bpf_cpu_rq)
 BTF_ID_FLAGS(func, scx_bpf_cpu_rq_locked, KF_RET_NULL)
+BTF_ID_FLAGS(func, scx_bpf_task_acquire_remote_curr, KF_RET_NULL | KF_ACQUIRE)
 #ifdef CONFIG_CGROUP_SCHED
 BTF_ID_FLAGS(func, scx_bpf_task_cgroup, KF_RCU | KF_ACQUIRE)
 #endif
diff --git a/tools/sched_ext/include/scx/common.bpf.h b/tools/sched_ext/include/scx/common.bpf.h
index 7451491347ed..2105cd0ee178 100644
--- a/tools/sched_ext/include/scx/common.bpf.h
+++ b/tools/sched_ext/include/scx/common.bpf.h
@@ -92,6 +92,7 @@ bool scx_bpf_task_running(const struct task_struct *p) __ksym;
 s32 scx_bpf_task_cpu(const struct task_struct *p) __ksym;
 struct rq *scx_bpf_cpu_rq(s32 cpu) __ksym;
 struct rq *scx_bpf_cpu_rq_locked(s32 cpu) __ksym;
+struct task_struct *scx_bpf_task_acquire_remote_curr(s32 cpu) __ksym;
 struct cgroup *scx_bpf_task_cgroup(struct task_struct *p) __ksym __weak;
 u64 scx_bpf_now(void) __ksym __weak;
 void scx_bpf_events(struct scx_event_stats *events, size_t events__sz) __ksym __weak;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v4 3/3] sched_ext: deprecation warn for scx_bpf_cpu_rq()
  2025-08-11 21:21 [PATCH v4 0/3] sched_ext: Harden scx_bpf_cpu_rq() Christian Loehle
  2025-08-11 21:21 ` [PATCH v4 1/3] sched_ext: Introduce scx_bpf_cpu_rq_locked() Christian Loehle
  2025-08-11 21:21 ` [PATCH v4 2/3] sched_ext: Provide scx_bpf_task_acquire_remote_curr() Christian Loehle
@ 2025-08-11 21:21 ` Christian Loehle
  2025-08-12  8:00 ` [PATCH v4 0/3] sched_ext: Harden scx_bpf_cpu_rq() Peter Zijlstra
  3 siblings, 0 replies; 11+ messages in thread
From: Christian Loehle @ 2025-08-11 21:21 UTC (permalink / raw)
  To: tj, arighi, void
  Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz, jake,
	Christian Loehle

scx_bpf_cpu_rq() works on an unlocked rq which generally isn't safe.
For the common use-cases scx_bpf_cpu_rq_locked() and
scx_bpf_task_acquire_remote_curr() work, so add a deprecation warning
to scx_bpf_cpu_rq() so it can eventually be removed.

Signed-off-by: Christian Loehle <christian.loehle@arm.com>
---
 kernel/sched/ext.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index ded4ace36090..7d2d88e8dd59 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -7423,6 +7423,9 @@ __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
 	if (!kf_cpu_valid(cpu, NULL))
 		return NULL;
 
+	pr_warn_once("%s() is deprecated in favor of scx_bpf_cpu_rq_locked() or "
+		     "scx_bpf_task_acquire_remote_curr() for unlocked remote curr\n",
+		     __func__);
 	return cpu_rq(cpu);
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v4 1/3] sched_ext: Introduce scx_bpf_cpu_rq_locked()
  2025-08-11 21:21 ` [PATCH v4 1/3] sched_ext: Introduce scx_bpf_cpu_rq_locked() Christian Loehle
@ 2025-08-11 23:38   ` Tejun Heo
  2025-08-12  9:07     ` Christian Loehle
  0 siblings, 1 reply; 11+ messages in thread
From: Tejun Heo @ 2025-08-11 23:38 UTC (permalink / raw)
  To: Christian Loehle
  Cc: arighi, void, linux-kernel, sched-ext, changwoo, hodgesd, mingo,
	peterz, jake

Hello,

On Mon, Aug 11, 2025 at 10:21:48PM +0100, Christian Loehle wrote:
> +/**
> + * scx_bpf_cpu_rq_locked - Fetch the locked rq of a CPU
> + * @cpu: CPU of the rq
> + */
> +__bpf_kfunc struct rq *scx_bpf_cpu_rq_locked(s32 cpu)
> +{
> +	struct rq *rq;
> +
> +	if (!kf_cpu_valid(cpu, NULL))
> +		return NULL;
> +
> +	preempt_disable();
> +	rq = cpu_rq(cpu);
> +	if (rq != scx_locked_rq()) {
> +		scx_kf_error("Accessing not locked rq %d", cpu);
> +		rq = NULL;
> +	}
> +	preempt_enable();
> +	return rq;
> +}

Do we need @cpu? What do you think about making the function not take any
arguments and just return the locked rq?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v4 0/3] sched_ext: Harden scx_bpf_cpu_rq()
  2025-08-11 21:21 [PATCH v4 0/3] sched_ext: Harden scx_bpf_cpu_rq() Christian Loehle
                   ` (2 preceding siblings ...)
  2025-08-11 21:21 ` [PATCH v4 3/3] sched_ext: deprecation warn for scx_bpf_cpu_rq() Christian Loehle
@ 2025-08-12  8:00 ` Peter Zijlstra
  2025-08-12 11:40   ` Christian Loehle
  3 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2025-08-12  8:00 UTC (permalink / raw)
  To: Christian Loehle
  Cc: tj, arighi, void, linux-kernel, sched-ext, changwoo, hodgesd,
	mingo, jake

On Mon, Aug 11, 2025 at 10:21:47PM +0100, Christian Loehle wrote:
> scx_bpf_cpu_rq() currently allows accessing struct rq fields without
> holding the associated rq.
> It is being used by scx_cosmos, scx_flash, scx_lavd, scx_layered, and
> scx_tickless. Fortunately it is only ever used to fetch rq->curr.
> So provide an alternative scx_bpf_task_acquire_remote_curr() that
> doesn't expose struct rq and provide a hardened scx_bpf_cpu_rq_locked()
> by ensuring we hold the rq lock.
> Add a deprecation warning to scx_bpf_cpu_rq_locked() that mentions the
> two alternatives.
> 
> This also simplifies scx code from:
> 
> rq = scx_bpf_cpu_rq(cpu);
> if (!rq)
> 	return;
> p = rq->curr
> if (!p)
> 	return;
> /* ... Do something with p */
> 
> into:
> 
> p = scx_bpf_task_acquire_remote_curr(cpu);
> if (!p)
> 	return;
> /* ... Do something with p */
> bpf_task_release(p);

Why do that mandatory refcount dance, rather than directly exposing the
RCU-ness of that pointer? IIRC BPF was good with RCU.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v4 1/3] sched_ext: Introduce scx_bpf_cpu_rq_locked()
  2025-08-11 23:38   ` Tejun Heo
@ 2025-08-12  9:07     ` Christian Loehle
  0 siblings, 0 replies; 11+ messages in thread
From: Christian Loehle @ 2025-08-12  9:07 UTC (permalink / raw)
  To: Tejun Heo
  Cc: arighi, void, linux-kernel, sched-ext, changwoo, hodgesd, mingo,
	peterz, jake

On 8/12/25 00:38, Tejun Heo wrote:
> Hello,
> 
> On Mon, Aug 11, 2025 at 10:21:48PM +0100, Christian Loehle wrote:
>> +/**
>> + * scx_bpf_cpu_rq_locked - Fetch the locked rq of a CPU
>> + * @cpu: CPU of the rq
>> + */
>> +__bpf_kfunc struct rq *scx_bpf_cpu_rq_locked(s32 cpu)
>> +{
>> +	struct rq *rq;
>> +
>> +	if (!kf_cpu_valid(cpu, NULL))
>> +		return NULL;
>> +
>> +	preempt_disable();
>> +	rq = cpu_rq(cpu);
>> +	if (rq != scx_locked_rq()) {
>> +		scx_kf_error("Accessing not locked rq %d", cpu);
>> +		rq = NULL;
>> +	}
>> +	preempt_enable();
>> +	return rq;
>> +}
> 
> Do we need @cpu? What do you think about making the function not take any
> arguments and just return the locked rq?

Indeed now that this no longer has to be a drop-in replacement.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v4 0/3] sched_ext: Harden scx_bpf_cpu_rq()
  2025-08-12  8:00 ` [PATCH v4 0/3] sched_ext: Harden scx_bpf_cpu_rq() Peter Zijlstra
@ 2025-08-12 11:40   ` Christian Loehle
  2025-08-12 13:36     ` Andrea Righi
  0 siblings, 1 reply; 11+ messages in thread
From: Christian Loehle @ 2025-08-12 11:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: tj, arighi, void, linux-kernel, sched-ext, changwoo, hodgesd,
	mingo, jake

On 8/12/25 09:00, Peter Zijlstra wrote:
> On Mon, Aug 11, 2025 at 10:21:47PM +0100, Christian Loehle wrote:
>> scx_bpf_cpu_rq() currently allows accessing struct rq fields without
>> holding the associated rq.
>> It is being used by scx_cosmos, scx_flash, scx_lavd, scx_layered, and
>> scx_tickless. Fortunately it is only ever used to fetch rq->curr.
>> So provide an alternative scx_bpf_task_acquire_remote_curr() that
>> doesn't expose struct rq and provide a hardened scx_bpf_cpu_rq_locked()
>> by ensuring we hold the rq lock.
>> Add a deprecation warning to scx_bpf_cpu_rq_locked() that mentions the
>> two alternatives.
>>
>> This also simplifies scx code from:
>>
>> rq = scx_bpf_cpu_rq(cpu);
>> if (!rq)
>> 	return;
>> p = rq->curr
>> if (!p)
>> 	return;
>> /* ... Do something with p */
>>
>> into:
>>
>> p = scx_bpf_task_acquire_remote_curr(cpu);
>> if (!p)
>> 	return;
>> /* ... Do something with p */
>> bpf_task_release(p);
> 
> Why do that mandatory refcount dance, rather than directly exposing the
> RCU-ness of that pointer? IIRC BPF was good with RCU.

Just because that's how
bpf_task_from_pid()
bpf_task_from_vpid()
already work. I have no strong preference either way.
Apart from the above just returning
rcu_dereference(cpu_rq(cpu)->curr);
is obviously a bit less cumbersome (and yes, RCUs are exposed to BPF,
for scx most callbacks have that implicit anyway.)

I'll change it to scx_bpf_remote_curr() that does that in the next
version, thanks!

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v4 0/3] sched_ext: Harden scx_bpf_cpu_rq()
  2025-08-12 11:40   ` Christian Loehle
@ 2025-08-12 13:36     ` Andrea Righi
  0 siblings, 0 replies; 11+ messages in thread
From: Andrea Righi @ 2025-08-12 13:36 UTC (permalink / raw)
  To: Christian Loehle
  Cc: Peter Zijlstra, tj, void, linux-kernel, sched-ext, changwoo,
	hodgesd, mingo, jake

On Tue, Aug 12, 2025 at 12:40:39PM +0100, Christian Loehle wrote:
> On 8/12/25 09:00, Peter Zijlstra wrote:
> > On Mon, Aug 11, 2025 at 10:21:47PM +0100, Christian Loehle wrote:
> >> scx_bpf_cpu_rq() currently allows accessing struct rq fields without
> >> holding the associated rq.
> >> It is being used by scx_cosmos, scx_flash, scx_lavd, scx_layered, and
> >> scx_tickless. Fortunately it is only ever used to fetch rq->curr.
> >> So provide an alternative scx_bpf_task_acquire_remote_curr() that
> >> doesn't expose struct rq and provide a hardened scx_bpf_cpu_rq_locked()
> >> by ensuring we hold the rq lock.
> >> Add a deprecation warning to scx_bpf_cpu_rq_locked() that mentions the
> >> two alternatives.
> >>
> >> This also simplifies scx code from:
> >>
> >> rq = scx_bpf_cpu_rq(cpu);
> >> if (!rq)
> >> 	return;
> >> p = rq->curr
> >> if (!p)
> >> 	return;
> >> /* ... Do something with p */
> >>
> >> into:
> >>
> >> p = scx_bpf_task_acquire_remote_curr(cpu);
> >> if (!p)
> >> 	return;
> >> /* ... Do something with p */
> >> bpf_task_release(p);
> > 
> > Why do that mandatory refcount dance, rather than directly exposing the
> > RCU-ness of that pointer? IIRC BPF was good with RCU.
> 
> Just because that's how
> bpf_task_from_pid()
> bpf_task_from_vpid()
> already work. I have no strong preference either way.
> Apart from the above just returning
> rcu_dereference(cpu_rq(cpu)->curr);
> is obviously a bit less cumbersome (and yes, RCUs are exposed to BPF,
> for scx most callbacks have that implicit anyway.)
> 
> I'll change it to scx_bpf_remote_curr() that does that in the next
> version, thanks!

Yeah, I suggested Christian to do the refcount dance to be consistent with
bpf_task_from_[v]pid(), but we can probably mark the kfunc as KF_RCU and
rely on RCU locking to save a bit of overhead. So, it's probably better to
follow Peter's suggestion.

Thanks,
-Andrea

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v4 2/3] sched_ext: Provide scx_bpf_task_acquire_remote_curr()
  2025-08-11 21:21 ` [PATCH v4 2/3] sched_ext: Provide scx_bpf_task_acquire_remote_curr() Christian Loehle
@ 2025-08-17  4:07   ` kernel test robot
  0 siblings, 0 replies; 11+ messages in thread
From: kernel test robot @ 2025-08-17  4:07 UTC (permalink / raw)
  To: Christian Loehle; +Cc: oe-kbuild-all

Hi Christian,

kernel test robot noticed the following build warnings:

[auto build test WARNING on tip/sched/core]
[also build test WARNING on linus/master v6.17-rc1 next-20250815]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Christian-Loehle/sched_ext-Introduce-scx_bpf_cpu_rq_locked/20250812-052423
base:   tip/sched/core
patch link:    https://lore.kernel.org/r/20250811212150.85759-3-christian.loehle%40arm.com
patch subject: [PATCH v4 2/3] sched_ext: Provide scx_bpf_task_acquire_remote_curr()
config: sparc64-randconfig-r121-20250817 (https://download.01.org/0day-ci/archive/20250817/202508171134.vxF7uU1z-lkp@intel.com/config)
compiler: sparc64-linux-gcc (GCC) 8.5.0
reproduce: (https://download.01.org/0day-ci/archive/20250817/202508171134.vxF7uU1z-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202508171134.vxF7uU1z-lkp@intel.com/

sparse warnings: (new ones prefixed by >>)
   kernel/sched/ext.c:3718:33: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct scx_sched *sch @@     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root @@
   kernel/sched/ext.c:3718:33: sparse:     expected struct scx_sched *sch
   kernel/sched/ext.c:3718:33: sparse:     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root
   kernel/sched/ext.c:3797:33: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct scx_sched *sch @@     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root @@
   kernel/sched/ext.c:3797:33: sparse:     expected struct scx_sched *sch
   kernel/sched/ext.c:3797:33: sparse:     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root
   kernel/sched/ext.c:3850:33: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct scx_sched *sch @@     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root @@
   kernel/sched/ext.c:3850:33: sparse:     expected struct scx_sched *sch
   kernel/sched/ext.c:3850:33: sparse:     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root
   kernel/sched/ext.c:3878:33: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct scx_sched *sch @@     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root @@
   kernel/sched/ext.c:3878:33: sparse:     expected struct scx_sched *sch
   kernel/sched/ext.c:3878:33: sparse:     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root
   kernel/sched/ext.c:3891:33: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct scx_sched *sch @@     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root @@
   kernel/sched/ext.c:3891:33: sparse:     expected struct scx_sched *sch
   kernel/sched/ext.c:3891:33: sparse:     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root
   kernel/sched/ext.c:4021:33: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct scx_sched *sch @@     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root @@
   kernel/sched/ext.c:4021:33: sparse:     expected struct scx_sched *sch
   kernel/sched/ext.c:4021:33: sparse:     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root
   kernel/sched/ext.c:4037:33: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct scx_sched *sch @@     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root @@
   kernel/sched/ext.c:4037:33: sparse:     expected struct scx_sched *sch
   kernel/sched/ext.c:4037:33: sparse:     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root
   kernel/sched/ext.c:4102:33: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct scx_sched *sch @@     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root @@
   kernel/sched/ext.c:4102:33: sparse:     expected struct scx_sched *sch
   kernel/sched/ext.c:4102:33: sparse:     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root
   kernel/sched/ext.c:4131:33: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct scx_sched *sch @@     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root @@
   kernel/sched/ext.c:4131:33: sparse:     expected struct scx_sched *sch
   kernel/sched/ext.c:4131:33: sparse:     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root
   kernel/sched/ext.c:4148:33: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct scx_sched *sch @@     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root @@
   kernel/sched/ext.c:4148:33: sparse:     expected struct scx_sched *sch
   kernel/sched/ext.c:4148:33: sparse:     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root
   kernel/sched/ext.c:4201:33: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct scx_sched *sch @@     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root @@
   kernel/sched/ext.c:4201:33: sparse:     expected struct scx_sched *sch
   kernel/sched/ext.c:4201:33: sparse:     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root
   kernel/sched/ext.c:4225:33: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct scx_sched *sch @@     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root @@
   kernel/sched/ext.c:4225:33: sparse:     expected struct scx_sched *sch
   kernel/sched/ext.c:4225:33: sparse:     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root
   kernel/sched/ext.c:4245:33: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct scx_sched *sch @@     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root @@
   kernel/sched/ext.c:4245:33: sparse:     expected struct scx_sched *sch
   kernel/sched/ext.c:4245:33: sparse:     got struct scx_sched [noderef] __rcu *static [addressable] [toplevel] scx_root
   kernel/sched/ext.c:5157:33: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct scx_sched *sch @@     got struct scx_sched [noderef] __rcu *static [addressable] [assigned] [toplevel] scx_root @@
   kernel/sched/ext.c:5157:33: sparse:     expected struct scx_sched *sch
   kernel/sched/ext.c:5157:33: sparse:     got struct scx_sched [noderef] __rcu *static [addressable] [assigned] [toplevel] scx_root
   kernel/sched/ext.c:5199:33: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct scx_sched *sch @@     got struct scx_sched [noderef] __rcu *static [addressable] [assigned] [toplevel] scx_root @@
   kernel/sched/ext.c:5199:33: sparse:     expected struct scx_sched *sch
   kernel/sched/ext.c:5199:33: sparse:     got struct scx_sched [noderef] __rcu *static [addressable] [assigned] [toplevel] scx_root
   kernel/sched/ext.c:5310:52: sparse: sparse: incorrect type in argument 3 (different address spaces) @@     expected struct task_struct *p @@     got struct task_struct [noderef] __rcu *curr @@
   kernel/sched/ext.c:5310:52: sparse:     expected struct task_struct *p
   kernel/sched/ext.c:5310:52: sparse:     got struct task_struct [noderef] __rcu *curr
   kernel/sched/ext.c:6050:32: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct task_struct const *p @@     got struct task_struct [noderef] __rcu *curr @@
   kernel/sched/ext.c:6050:32: sparse:     expected struct task_struct const *p
   kernel/sched/ext.c:6050:32: sparse:     got struct task_struct [noderef] __rcu *curr
   kernel/sched/ext.c:6159:33: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct scx_sched *sch @@     got struct scx_sched [noderef] __rcu *static [addressable] [assigned] [toplevel] scx_root @@
   kernel/sched/ext.c:6159:33: sparse:     expected struct scx_sched *sch
   kernel/sched/ext.c:6159:33: sparse:     got struct scx_sched [noderef] __rcu *static [addressable] [assigned] [toplevel] scx_root
   kernel/sched/ext.c:6431:33: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct scx_sched *sch @@     got struct scx_sched [noderef] __rcu *static [addressable] [assigned] [toplevel] scx_root @@
   kernel/sched/ext.c:6431:33: sparse:     expected struct scx_sched *sch
   kernel/sched/ext.c:6431:33: sparse:     got struct scx_sched [noderef] __rcu *static [addressable] [assigned] [toplevel] scx_root
   kernel/sched/ext.c:6564:33: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct scx_sched *sch @@     got struct scx_sched [noderef] __rcu *static [addressable] [assigned] [toplevel] scx_root @@
   kernel/sched/ext.c:6564:33: sparse:     expected struct scx_sched *sch
   kernel/sched/ext.c:6564:33: sparse:     got struct scx_sched [noderef] __rcu *static [addressable] [assigned] [toplevel] scx_root
   kernel/sched/ext.c:7411:33: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/sched/ext.c:7411:33: sparse:    struct task_struct [noderef] __rcu *
   kernel/sched/ext.c:7411:33: sparse:    struct task_struct const *
>> kernel/sched/ext.c:7472:11: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected struct task_struct *p @@     got struct task_struct [noderef] __rcu *curr @@
   kernel/sched/build_policy.c: note: in included file:
   kernel/sched/ext_idle.c:863:5: sparse: sparse: symbol 'select_cpu_from_kfunc' was not declared. Should it be static?
   kernel/sched/build_policy.c: note: in included file:
   kernel/sched/syscalls.c:206:22: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/sched/syscalls.c:206:22: sparse:    struct task_struct [noderef] __rcu *
   kernel/sched/syscalls.c:206:22: sparse:    struct task_struct *
   kernel/sched/build_policy.c: note: in included file:
   kernel/sched/sched.h:2251:25: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/sched/sched.h:2251:25: sparse:    struct task_struct [noderef] __rcu *
   kernel/sched/sched.h:2251:25: sparse:    struct task_struct *
   kernel/sched/sched.h:2251:25: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/sched/sched.h:2251:25: sparse:    struct task_struct [noderef] __rcu *
   kernel/sched/sched.h:2251:25: sparse:    struct task_struct *
   kernel/sched/sched.h:2262:26: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/sched/sched.h:2262:26: sparse:    struct task_struct [noderef] __rcu *
   kernel/sched/sched.h:2262:26: sparse:    struct task_struct *
   kernel/sched/sched.h:2251:25: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/sched/sched.h:2251:25: sparse:    struct task_struct [noderef] __rcu *
   kernel/sched/sched.h:2251:25: sparse:    struct task_struct *
   kernel/sched/sched.h:2262:26: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/sched/sched.h:2262:26: sparse:    struct task_struct [noderef] __rcu *
   kernel/sched/sched.h:2262:26: sparse:    struct task_struct *
   kernel/sched/sched.h:2251:25: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/sched/sched.h:2251:25: sparse:    struct task_struct [noderef] __rcu *
   kernel/sched/sched.h:2251:25: sparse:    struct task_struct *
   kernel/sched/sched.h:2251:25: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/sched/sched.h:2251:25: sparse:    struct task_struct [noderef] __rcu *
   kernel/sched/sched.h:2251:25: sparse:    struct task_struct *
   kernel/sched/sched.h:2262:26: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/sched/sched.h:2262:26: sparse:    struct task_struct [noderef] __rcu *
   kernel/sched/sched.h:2262:26: sparse:    struct task_struct *
   kernel/sched/sched.h:2251:25: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/sched/sched.h:2251:25: sparse:    struct task_struct [noderef] __rcu *
   kernel/sched/sched.h:2251:25: sparse:    struct task_struct *
   kernel/sched/sched.h:2251:25: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/sched/sched.h:2251:25: sparse:    struct task_struct [noderef] __rcu *
   kernel/sched/sched.h:2251:25: sparse:    struct task_struct *
   kernel/sched/sched.h:2262:26: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/sched/sched.h:2262:26: sparse:    struct task_struct [noderef] __rcu *
   kernel/sched/sched.h:2262:26: sparse:    struct task_struct *
   kernel/sched/sched.h:2447:9: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/sched/sched.h:2447:9: sparse:    struct task_struct [noderef] __rcu *
   kernel/sched/sched.h:2447:9: sparse:    struct task_struct *
   kernel/sched/sched.h:2262:26: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/sched/sched.h:2262:26: sparse:    struct task_struct [noderef] __rcu *
   kernel/sched/sched.h:2262:26: sparse:    struct task_struct *
   kernel/sched/sched.h:2447:9: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/sched/sched.h:2447:9: sparse:    struct task_struct [noderef] __rcu *
   kernel/sched/sched.h:2447:9: sparse:    struct task_struct *
   kernel/sched/build_policy.c: note: in included file:
   kernel/sched/syscalls.c:1289:6: sparse: sparse: context imbalance in 'sched_getaffinity' - different lock contexts for basic block
   kernel/sched/build_policy.c: note: in included file:
   kernel/sched/rt.c:1659:15: sparse: sparse: dereference of noderef expression

vim +7472 kernel/sched/ext.c

  7455	
  7456	/**
  7457	 * scx_bpf_task_acquire_remote_curr - Fetch the curr task of a rq without
  7458	 * acquiring its rq lock
  7459	 * @cpu: CPU of the rq
  7460	 *
  7461	 * Increments the refcount of the task_struct which needs to be released using
  7462	 * bpf_task_release().
  7463	 */
  7464	__bpf_kfunc struct task_struct *scx_bpf_task_acquire_remote_curr(s32 cpu)
  7465	{
  7466		struct task_struct *p;
  7467	
  7468		if (!kf_cpu_valid(cpu, NULL))
  7469			return NULL;
  7470	
  7471		rcu_read_lock();
> 7472		p = cpu_rq(cpu)->curr;
  7473		if (p)
  7474			p = refcount_inc_not_zero(&p->rcu_users) ? p : NULL;
  7475		rcu_read_unlock();
  7476		return p;
  7477	}
  7478	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v4 0/3] sched_ext: Harden scx_bpf_cpu_rq()
@ 2025-09-01 13:26 Christian Loehle
  0 siblings, 0 replies; 11+ messages in thread
From: Christian Loehle @ 2025-09-01 13:26 UTC (permalink / raw)
  To: tj, arighi, void
  Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz, jake,
	Christian Loehle

scx_bpf_cpu_rq() currently allows accessing struct rq fields without
holding the associated rq.
It is being used by scx_cosmos, scx_flash, scx_lavd, scx_layered, and
scx_tickless. Fortunately it is only ever used to fetch rq->curr.
So provide an alternative scx_bpf_task_acquire_remote_curr() that
doesn't expose struct rq and provide a hardened scx_bpf_cpu_rq_locked()
by ensuring we hold the rq lock.
Add a deprecation warning to scx_bpf_cpu_rq_locked() that mentions the
two alternatives.

This also simplifies scx code from:

rq = scx_bpf_cpu_rq(cpu);
if (!rq)
	return;
p = rq->curr
if (!p)
	return;
/* ... Do something with p */

into:

p = scx_bpf_task_acquire_remote_curr(cpu);
if (!p)
	return;
/* ... Do something with p */
bpf_task_release(p);


v3:
https://lore.kernel.org/lkml/20250805111036.130121-1-christian.loehle@arm.com/
Don't change scx_bpf_cpu_rq() do not break BPF schedulers without the
grace period. Just add the deprecation warning and do the hardening in
the new scx_bpf_cpu_rq_locked(). (Andrea, Tejun, Jake)
v2:
https://lore.kernel.org/lkml/20250804112743.711816-1-christian.loehle@arm.com/
- Open-code bpf_task_acquire() to avoid the forward declaration (Andrea)
- Rename scx_bpf_task_acquire_remote_curr() to make it more explicit it
behaves like bpf_task_acquire()
- Dis
v1:
https://lore.kernel.org/lkml/20250801141741.355059-1-christian.loehle@arm.com/
- scx_bpf_cpu_rq() now errors when a not locked rq is requested. (Andrea)
- scx_bpf_remote_curr() calls bpf_task_acquire() which BPF user needs to
release. (Andrea)
Christian Loehle (3):
  sched_ext: Introduce scx_bpf_cpu_rq_locked()
  sched_ext: Provide scx_bpf_task_acquire_remote_curr()
  sched_ext: deprecation warn for scx_bpf_cpu_rq()

 kernel/sched/ext.c                       | 49 ++++++++++++++++++++++++
 tools/sched_ext/include/scx/common.bpf.h |  2 +
 2 files changed, 51 insertions(+)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2025-09-01 13:26 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-11 21:21 [PATCH v4 0/3] sched_ext: Harden scx_bpf_cpu_rq() Christian Loehle
2025-08-11 21:21 ` [PATCH v4 1/3] sched_ext: Introduce scx_bpf_cpu_rq_locked() Christian Loehle
2025-08-11 23:38   ` Tejun Heo
2025-08-12  9:07     ` Christian Loehle
2025-08-11 21:21 ` [PATCH v4 2/3] sched_ext: Provide scx_bpf_task_acquire_remote_curr() Christian Loehle
2025-08-17  4:07   ` kernel test robot
2025-08-11 21:21 ` [PATCH v4 3/3] sched_ext: deprecation warn for scx_bpf_cpu_rq() Christian Loehle
2025-08-12  8:00 ` [PATCH v4 0/3] sched_ext: Harden scx_bpf_cpu_rq() Peter Zijlstra
2025-08-12 11:40   ` Christian Loehle
2025-08-12 13:36     ` Andrea Righi
  -- strict thread matches above, loose matches on Subject: below --
2025-09-01 13:26 Christian Loehle

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.