linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 0/3] sched_ext: Harden scx_bpf_cpu_rq()
@ 2025-08-05 11:10 Christian Loehle
  2025-08-05 11:10 ` [PATCH v3 1/3] sched_ext: Mark scx_bpf_cpu_rq as NULL returnable Christian Loehle
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Christian Loehle @ 2025-08-05 11:10 UTC (permalink / raw)
  To: tj, arighi, void
  Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz,
	Christian Loehle

scx_bpf_cpu_rq() currently allows accessing struct rq fields without
holding the associated rq lock.
It is being used by scx_cosmos, scx_flash, scx_lavd, scx_layered, and
scx_tickless. Fortunately it is only ever used to fetch rq->curr.
So provide an alternative scx_bpf_task_acquire_remote_curr() that
doesn't expose struct rq and harden scx_bpf_cpu_rq() by ensuring we
hold the rq lock.

This also simplifies scx code from:

rq = scx_bpf_cpu_rq(cpu);
if (!rq)
	return;
p = rq->curr;
if (!p)
	return;
/* ... Do something with p */

into:

p = scx_bpf_task_acquire_remote_curr(cpu);
if (!p)
	return;
/* ... Do something with p */
bpf_task_release(p);
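
As a fuller sketch of the new pattern (the helper name and the preemption
policy below are illustrative only, not part of this series; the kfuncs
are the ones discussed here and existing sched_ext API):

/* Check the remote CPU's current task and kick it if it's idle. */
static void check_remote_cpu(s32 cpu)
{
	struct task_struct *curr;

	curr = scx_bpf_task_acquire_remote_curr(cpu);
	if (!curr)
		return;
	/* e.g. preempt if the remote CPU is running the idle task */
	if (curr->flags & PF_IDLE)
		scx_bpf_kick_cpu(cpu, SCX_KICK_PREEMPT);
	bpf_task_release(curr);
}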


Patch 1 was previously submitted and can be applied independently of
the other two.
https://lore.kernel.org/lkml/43a9cbdc-5121-4dc8-8438-0f01c90a4687@arm.com/
https://lore.kernel.org/lkml/0b8111c6-1b14-41dc-a674-14a6361992b3@arm.com/

v2:
https://lore.kernel.org/lkml/20250804112743.711816-1-christian.loehle@arm.com/
- Open-code bpf_task_acquire() to avoid the forward declaration (Andrea)
- Rename scx_bpf_task_acquire_remote_curr() to make it more explicit that
it behaves like bpf_task_acquire()
- Dis
v1:
https://lore.kernel.org/lkml/20250801141741.355059-1-christian.loehle@arm.com/
- scx_bpf_cpu_rq() now errors when an unlocked rq is requested. (Andrea)
- scx_bpf_remote_curr() calls bpf_task_acquire() which BPF user needs to
release. (Andrea)

Christian Loehle (3):
  sched_ext: Mark scx_bpf_cpu_rq as NULL returnable
  sched_ext: Provide scx_bpf_task_acquire_remote_curr()
  sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()

 kernel/sched/ext.c                       | 38 ++++++++++++++++++++++--
 tools/sched_ext/include/scx/common.bpf.h |  1 +
 2 files changed, 37 insertions(+), 2 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v3 1/3] sched_ext: Mark scx_bpf_cpu_rq as NULL returnable
  2025-08-05 11:10 [PATCH v3 0/3] sched_ext: Harden scx_bpf_cpu_rq() Christian Loehle
@ 2025-08-05 11:10 ` Christian Loehle
  2025-08-05 11:10 ` [PATCH v3 2/3] sched_ext: Provide scx_bpf_task_acquire_remote_curr() Christian Loehle
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: Christian Loehle @ 2025-08-05 11:10 UTC (permalink / raw)
  To: tj, arighi, void
  Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz,
	Christian Loehle, stable

scx_bpf_cpu_rq() obviously returns NULL on invalid cpu.
Mark it as such.
While kf_cpu_valid() will trigger scx_ops_error(), which leads
to the BPF scheduler exiting, this isn't guaranteed to happen immediately,
allowing a NULL scx_bpf_cpu_rq() return value to be dereferenced in the
meantime.

Cc: stable@vger.kernel.org
Fixes: 6203ef73fa5c ("sched/ext: Add BPF function to fetch rq")
Signed-off-by: Christian Loehle <christian.loehle@arm.com>
Acked-by: Andrea Righi <arighi@nvidia.com>
---
 kernel/sched/ext.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 7dedc9a16281..3ea3f0f18030 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -7589,7 +7589,7 @@ BTF_ID_FLAGS(func, scx_bpf_get_online_cpumask, KF_ACQUIRE)
 BTF_ID_FLAGS(func, scx_bpf_put_cpumask, KF_RELEASE)
 BTF_ID_FLAGS(func, scx_bpf_task_running, KF_RCU)
 BTF_ID_FLAGS(func, scx_bpf_task_cpu, KF_RCU)
-BTF_ID_FLAGS(func, scx_bpf_cpu_rq)
+BTF_ID_FLAGS(func, scx_bpf_cpu_rq, KF_RET_NULL)
 #ifdef CONFIG_CGROUP_SCHED
 BTF_ID_FLAGS(func, scx_bpf_task_cgroup, KF_RCU | KF_ACQUIRE)
 #endif
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 2/3] sched_ext: Provide scx_bpf_task_acquire_remote_curr()
  2025-08-05 11:10 [PATCH v3 0/3] sched_ext: Harden scx_bpf_cpu_rq() Christian Loehle
  2025-08-05 11:10 ` [PATCH v3 1/3] sched_ext: Mark scx_bpf_cpu_rq as NULL returnable Christian Loehle
@ 2025-08-05 11:10 ` Christian Loehle
  2025-08-09 19:01   ` Tejun Heo
  2025-08-05 11:10 ` [PATCH v3 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq() Christian Loehle
  2025-08-05 14:33 ` [PATCH v3 0/3] sched_ext: Harden scx_bpf_cpu_rq() Andrea Righi
  3 siblings, 1 reply; 14+ messages in thread
From: Christian Loehle @ 2025-08-05 11:10 UTC (permalink / raw)
  To: tj, arighi, void
  Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz,
	Christian Loehle

Provide scx_bpf_task_acquire_remote_curr() as a way for scx schedulers
to check the curr task of a remote rq without assuming its lock is
held.

Many scx schedulers make use of scx_bpf_cpu_rq() to check a remote curr
(e.g. to see if it should be preempted). This is problematic because
scx_bpf_cpu_rq() provides access to all fields of struct rq, most of
which aren't safe to use without holding the associated rq lock.

Signed-off-by: Christian Loehle <christian.loehle@arm.com>
---
 kernel/sched/ext.c                       | 24 ++++++++++++++++++++++++
 tools/sched_ext/include/scx/common.bpf.h |  1 +
 2 files changed, 25 insertions(+)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 3ea3f0f18030..3e2fa0b1eb57 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -7426,6 +7426,29 @@ __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
 	return cpu_rq(cpu);
 }
 
+/**
+ * scx_bpf_task_acquire_remote_curr - Fetch the curr task of a rq without
+ * acquiring its rq lock
+ * @cpu: CPU of the rq
+ *
+ * Increments the refcount of the task_struct which needs to be released using
+ * bpf_task_release().
+ */
+__bpf_kfunc struct task_struct *scx_bpf_task_acquire_remote_curr(s32 cpu)
+{
+	struct task_struct *p;
+
+	if (!kf_cpu_valid(cpu, NULL))
+		return NULL;
+
+	rcu_read_lock();
+	p = cpu_rq(cpu)->curr;
+	if (p)
+		p = refcount_inc_not_zero(&p->rcu_users) ? p : NULL;
+	rcu_read_unlock();
+	return p;
+}
+
 /**
  * scx_bpf_task_cgroup - Return the sched cgroup of a task
  * @p: task of interest
@@ -7590,6 +7613,7 @@ BTF_ID_FLAGS(func, scx_bpf_put_cpumask, KF_RELEASE)
 BTF_ID_FLAGS(func, scx_bpf_task_running, KF_RCU)
 BTF_ID_FLAGS(func, scx_bpf_task_cpu, KF_RCU)
 BTF_ID_FLAGS(func, scx_bpf_cpu_rq, KF_RET_NULL)
+BTF_ID_FLAGS(func, scx_bpf_task_acquire_remote_curr, KF_RET_NULL | KF_ACQUIRE)
 #ifdef CONFIG_CGROUP_SCHED
 BTF_ID_FLAGS(func, scx_bpf_task_cgroup, KF_RCU | KF_ACQUIRE)
 #endif
diff --git a/tools/sched_ext/include/scx/common.bpf.h b/tools/sched_ext/include/scx/common.bpf.h
index d4e21558e982..bdd68f3100b7 100644
--- a/tools/sched_ext/include/scx/common.bpf.h
+++ b/tools/sched_ext/include/scx/common.bpf.h
@@ -91,6 +91,7 @@ s32 scx_bpf_pick_any_cpu(const cpumask_t *cpus_allowed, u64 flags) __ksym;
 bool scx_bpf_task_running(const struct task_struct *p) __ksym;
 s32 scx_bpf_task_cpu(const struct task_struct *p) __ksym;
 struct rq *scx_bpf_cpu_rq(s32 cpu) __ksym;
+struct task_struct *scx_bpf_task_acquire_remote_curr(s32 cpu) __ksym;
 struct cgroup *scx_bpf_task_cgroup(struct task_struct *p) __ksym __weak;
 u64 scx_bpf_now(void) __ksym __weak;
 void scx_bpf_events(struct scx_event_stats *events, size_t events__sz) __ksym __weak;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
  2025-08-05 11:10 [PATCH v3 0/3] sched_ext: Harden scx_bpf_cpu_rq() Christian Loehle
  2025-08-05 11:10 ` [PATCH v3 1/3] sched_ext: Mark scx_bpf_cpu_rq as NULL returnable Christian Loehle
  2025-08-05 11:10 ` [PATCH v3 2/3] sched_ext: Provide scx_bpf_task_acquire_remote_curr() Christian Loehle
@ 2025-08-05 11:10 ` Christian Loehle
  2025-08-09 19:03   ` Tejun Heo
  2025-08-05 14:33 ` [PATCH v3 0/3] sched_ext: Harden scx_bpf_cpu_rq() Andrea Righi
  3 siblings, 1 reply; 14+ messages in thread
From: Christian Loehle @ 2025-08-05 11:10 UTC (permalink / raw)
  To: tj, arighi, void
  Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz,
	Christian Loehle

Most fields of the rq returned by scx_bpf_cpu_rq() are only safe to
access while its rq lock is held, and are meaningless without it.
Therefore, only return a non-NULL rq from scx_bpf_cpu_rq() if we hold
that rq's lock.

All upstream scx schedulers can be converted to the new
scx_bpf_task_acquire_remote_curr() instead.
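
With this change, scx_bpf_cpu_rq() only succeeds from callbacks where
the rq in question is already locked. A hedged sketch (the ops name is
illustrative; ops.tick() runs with the task's rq locked):

/* Valid: ops.tick() holds the rq lock of p's CPU. */
void BPF_STRUCT_OPS(myscx_tick, struct task_struct *p)
{
	struct rq *rq = scx_bpf_cpu_rq(scx_bpf_task_cpu(p));

	if (!rq)
		return;
	/* rq fields may be read here, under the held rq lock */
}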

Signed-off-by: Christian Loehle <christian.loehle@arm.com>
---
 kernel/sched/ext.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 3e2fa0b1eb57..a66cf654f33e 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -7420,10 +7420,20 @@ __bpf_kfunc s32 scx_bpf_task_cpu(const struct task_struct *p)
  */
 __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
 {
+	struct rq *rq;
+
 	if (!kf_cpu_valid(cpu, NULL))
 		return NULL;
 
-	return cpu_rq(cpu);
+	preempt_disable();
+	rq = cpu_rq(cpu);
+	if (rq != scx_locked_rq()) {
+		scx_kf_error("Accessing not locked rq %d", cpu);
+		rq = NULL;
+	}
+	preempt_enable();
+
+	return rq;
 }
 
 /**
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 0/3] sched_ext: Harden scx_bpf_cpu_rq()
  2025-08-05 11:10 [PATCH v3 0/3] sched_ext: Harden scx_bpf_cpu_rq() Christian Loehle
                   ` (2 preceding siblings ...)
  2025-08-05 11:10 ` [PATCH v3 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq() Christian Loehle
@ 2025-08-05 14:33 ` Andrea Righi
  3 siblings, 0 replies; 14+ messages in thread
From: Andrea Righi @ 2025-08-05 14:33 UTC (permalink / raw)
  To: Christian Loehle
  Cc: tj, void, linux-kernel, sched-ext, changwoo, hodgesd, mingo,
	peterz

On Tue, Aug 05, 2025 at 12:10:33PM +0100, Christian Loehle wrote:
> scx_bpf_cpu_rq() currently allows accessing struct rq fields without
> holding the associated rq lock.
> It is being used by scx_cosmos, scx_flash, scx_lavd, scx_layered, and
> scx_tickless. Fortunately it is only ever used to fetch rq->curr.
> So provide an alternative scx_bpf_task_acquire_remote_curr() that
> doesn't expose struct rq and harden scx_bpf_cpu_rq() by ensuring we
> hold the rq lock.
> 
> This also simplifies scx code from:
> 
> rq = scx_bpf_cpu_rq(cpu);
> if (!rq)
> 	return;
> p = rq->curr;
> if (!p)
> 	return;
> /* ... Do something with p */
> 
> into:
> 
> p = scx_bpf_task_acquire_remote_curr(cpu);
> if (!p)
> 	return;
> /* ... Do something with p */
> bpf_task_release(p);

Looks good to me.

Reviewed-by: Andrea Righi <arighi@nvidia.com>

Thanks,
-Andrea

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 2/3] sched_ext: Provide scx_bpf_task_acquire_remote_curr()
  2025-08-05 11:10 ` [PATCH v3 2/3] sched_ext: Provide scx_bpf_task_acquire_remote_curr() Christian Loehle
@ 2025-08-09 19:01   ` Tejun Heo
  2025-08-11 17:08     ` Tejun Heo
  0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2025-08-09 19:01 UTC (permalink / raw)
  To: Christian Loehle
  Cc: arighi, void, linux-kernel, sched-ext, changwoo, hodgesd, mingo,
	peterz

On Tue, Aug 05, 2025 at 12:10:35PM +0100, Christian Loehle wrote:
> Provide scx_bpf_task_acquire_remote_curr() as a way for scx schedulers
> to check the curr task of a remote rq without assuming its lock is
> held.
> 
> Many scx schedulers make use of scx_bpf_cpu_rq() to check a remote curr
> (e.g. to see if it should be preempted). This is problematic because
> scx_bpf_cpu_rq() provides access to all fields of struct rq, most of
> which aren't safe to use without holding the associated rq lock.
> 
> Signed-off-by: Christian Loehle <christian.loehle@arm.com>

Applied 1-2 to sched_ext/for-6.17-fixes.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
  2025-08-05 11:10 ` [PATCH v3 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq() Christian Loehle
@ 2025-08-09 19:03   ` Tejun Heo
  2025-08-10 10:52     ` Andrea Righi
  0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2025-08-09 19:03 UTC (permalink / raw)
  To: Christian Loehle
  Cc: arighi, void, linux-kernel, sched-ext, changwoo, hodgesd, mingo,
	peterz

Hello,

On Tue, Aug 05, 2025 at 12:10:36PM +0100, Christian Loehle wrote:
> @@ -7420,10 +7420,20 @@ __bpf_kfunc s32 scx_bpf_task_cpu(const struct task_struct *p)
>   */
>  __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
>  {
> +	struct rq *rq;
> +
>  	if (!kf_cpu_valid(cpu, NULL))
>  		return NULL;
>  
> -	return cpu_rq(cpu);
> +	preempt_disable();
> +	rq = cpu_rq(cpu);
> +	if (rq != scx_locked_rq()) {
> +		scx_kf_error("Accessing not locked rq %d", cpu);
> +		rq = NULL;
> +	}
> +	preempt_enable();

So, this will break the existing scheduler binaries immediately, which I
don't think is a good way to go about it. Can you add a pr_warn_once() to
print deprecation warning and add e.g. scx_bpf_locked_cpu_rq() instead?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
  2025-08-09 19:03   ` Tejun Heo
@ 2025-08-10 10:52     ` Andrea Righi
  2025-08-10 22:18       ` Christian Loehle
  2025-08-11 14:35       ` Jake Hillion
  0 siblings, 2 replies; 14+ messages in thread
From: Andrea Righi @ 2025-08-10 10:52 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Christian Loehle, void, linux-kernel, sched-ext, changwoo,
	hodgesd, mingo, peterz

On Sat, Aug 09, 2025 at 09:03:46AM -1000, Tejun Heo wrote:
> Hello,
> 
> On Tue, Aug 05, 2025 at 12:10:36PM +0100, Christian Loehle wrote:
> > @@ -7420,10 +7420,20 @@ __bpf_kfunc s32 scx_bpf_task_cpu(const struct task_struct *p)
> >   */
> >  __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
> >  {
> > +	struct rq *rq;
> > +
> >  	if (!kf_cpu_valid(cpu, NULL))
> >  		return NULL;
> >  
> > -	return cpu_rq(cpu);
> > +	preempt_disable();
> > +	rq = cpu_rq(cpu);
> > +	if (rq != scx_locked_rq()) {
> > +		scx_kf_error("Accessing not locked rq %d", cpu);
> > +		rq = NULL;
> > +	}
> > +	preempt_enable();
> 
> So, this will break the existing scheduler binaries immediately, which I
> don't think is a good way to go about it. Can you add a pr_warn_once() to
> print deprecation warning and add e.g. scx_bpf_locked_cpu_rq() instead?

Yeah, this is not nice, but they would be still broken though, in PATCH 1/3
we force schedulers to check for NULL and, if they don't, the verifier
won't be happy, so this already breaks existing binaries.

Even if a scheduler performs the NULL check, this change might still cause
incorrect behaviors, which can be worse than triggering an error.

How about we introduce scx_bpf_locked_cpu_rq() and we still trigger an
error in scx_bpf_cpu_rq(), mentioning the new locked kfunc and
scx_bpf_task_acquire_remote_curr()?

Thanks,
-Andrea

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
  2025-08-10 10:52     ` Andrea Righi
@ 2025-08-10 22:18       ` Christian Loehle
  2025-08-11 16:52         ` Tejun Heo
  2025-08-11 14:35       ` Jake Hillion
  1 sibling, 1 reply; 14+ messages in thread
From: Christian Loehle @ 2025-08-10 22:18 UTC (permalink / raw)
  To: Andrea Righi, Tejun Heo
  Cc: void, linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz

On 8/10/25 11:52, Andrea Righi wrote:
> On Sat, Aug 09, 2025 at 09:03:46AM -1000, Tejun Heo wrote:
>> Hello,
>>
>> On Tue, Aug 05, 2025 at 12:10:36PM +0100, Christian Loehle wrote:
>>> @@ -7420,10 +7420,20 @@ __bpf_kfunc s32 scx_bpf_task_cpu(const struct task_struct *p)
>>>   */
>>>  __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
>>>  {
>>> +	struct rq *rq;
>>> +
>>>  	if (!kf_cpu_valid(cpu, NULL))
>>>  		return NULL;
>>>  
>>> -	return cpu_rq(cpu);
>>> +	preempt_disable();
>>> +	rq = cpu_rq(cpu);
>>> +	if (rq != scx_locked_rq()) {
>>> +		scx_kf_error("Accessing not locked rq %d", cpu);
>>> +		rq = NULL;
>>> +	}
>>> +	preempt_enable();
>>
>> So, this will break the existing scheduler binaries immediately, which I
>> don't think is a good way to go about it. Can you add a pr_warn_once() to
>> print deprecation warning and add e.g. scx_bpf_locked_cpu_rq() instead?
> 
> Yeah, this is not nice, but they would be still broken though, in PATCH 1/3
> we force schedulers to check for NULL and, if they don't, the verifier
> won't be happy, so this already breaks existing binaries.
> 
> Even if a scheduler performs the NULL check, this change might still cause
> incorrect behaviors, which can be worse than triggering an error.
> 
> How about we introduce scx_bpf_locked_cpu_rq() and we still trigger an
> error in scx_bpf_cpu_rq(), mentioning about the new locked kfunc and
> scx_bpf_task_acquire_remote_curr()?

If we still trigger an error in scx_bpf_cpu_rq() what's the difference
between scx_bpf_cpu_rq() and scx_bpf_cpu_rq_locked() then?
Just the error message?


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
  2025-08-10 10:52     ` Andrea Righi
  2025-08-10 22:18       ` Christian Loehle
@ 2025-08-11 14:35       ` Jake Hillion
  2025-08-11 16:51         ` Tejun Heo
  2025-08-12 13:31         ` Andrea Righi
  1 sibling, 2 replies; 14+ messages in thread
From: Jake Hillion @ 2025-08-11 14:35 UTC (permalink / raw)
  To: Andrea Righi
  Cc: Tejun Heo, Christian Loehle, void, linux-kernel, sched-ext,
	changwoo, hodgesd, mingo, peterz

On Sun, Aug 10, 2025 at 12:52:53PM +0200, Andrea Righi wrote:
> Yeah, this is not nice, but they would be still broken though, in PATCH 1/3
> we force schedulers to check for NULL and, if they don't, the verifier
> won't be happy, so this already breaks existing binaries.

I ran some testing on the sched_ext for-next branch, and scx_cosmos is
breaking in cosmos_init including the latest changes. I believe it kicks
off a timer in init, which indirectly calls
`scx_bpf_cpu_rq(cpu)->curr->flags & PF_IDLE`. This should be NULL
checked, but old binaries breaking is pretty inconvenient for new users.

As Andrea says, this is the already merged patch triggering this.

Thanks,
Jake.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
  2025-08-11 14:35       ` Jake Hillion
@ 2025-08-11 16:51         ` Tejun Heo
  2025-08-12 13:31         ` Andrea Righi
  1 sibling, 0 replies; 14+ messages in thread
From: Tejun Heo @ 2025-08-11 16:51 UTC (permalink / raw)
  To: Jake Hillion
  Cc: Andrea Righi, Christian Loehle, void, linux-kernel, sched-ext,
	changwoo, hodgesd, mingo, peterz

On Mon, Aug 11, 2025 at 03:35:05PM +0100, Jake Hillion wrote:
> On Sun, Aug 10, 2025 at 12:52:53PM +0200, Andrea Righi wrote:
> > Yeah, this is not nice, but they would be still broken though, in PATCH 1/3
> > we force schedulers to check for NULL and, if they don't, the verifier
> > won't be happy, so this already breaks existing binaries.
> 
> I ran some testing on the sched_ext for-next branch, and scx_cosmos is
> breaking in cosmos_init including the latest changes. I believe it kicks
> off a timer in init, which indirectly calls
> `scx_bpf_cpu_rq(cpu)->curr->flags & PF_IDLE`. This should be NULL
> checked, but old binaries breaking is pretty inconvenient for new users.
> 
> As Andrea says, this is the already merged patch triggering this.

Lemme revert that. I don't think we should introduce breaking changes
without grace period.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
  2025-08-10 22:18       ` Christian Loehle
@ 2025-08-11 16:52         ` Tejun Heo
  0 siblings, 0 replies; 14+ messages in thread
From: Tejun Heo @ 2025-08-11 16:52 UTC (permalink / raw)
  To: Christian Loehle
  Cc: Andrea Righi, void, linux-kernel, sched-ext, changwoo, hodgesd,
	mingo, peterz

Hello,

On Sun, Aug 10, 2025 at 11:18:52PM +0100, Christian Loehle wrote:
...
> > Yeah, this is not nice, but they would be still broken though, in PATCH 1/3
> > we force schedulers to check for NULL and, if they don't, the verifier
> > won't be happy, so this already breaks existing binaries.
> > 
> > Even if a scheduler performs the NULL check, this change might still cause
> > incorrect behaviors, which can be worse than triggering an error.

I'll revert that change. We shouldn't be introducing breaking changes
without grace period.

> > How about we introduce scx_bpf_locked_cpu_rq() and we still trigger an
> > error in scx_bpf_cpu_rq(), mentioning the new locked kfunc and
> > scx_bpf_task_acquire_remote_curr()?
> 
> If we still trigger an error in scx_bpf_cpu_rq() what's the difference
> between scx_bpf_cpu_rq() and scx_bpf_cpu_rq_locked() then?
> Just the error message?

Let's add a warning to scx_bpf_cpu_rq() pointing to scx_bpf_cpu_rq_locked(),
update the schedulers, and drop scx_bpf_cpu_rq() in a couple releases.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 2/3] sched_ext: Provide scx_bpf_task_acquire_remote_curr()
  2025-08-09 19:01   ` Tejun Heo
@ 2025-08-11 17:08     ` Tejun Heo
  0 siblings, 0 replies; 14+ messages in thread
From: Tejun Heo @ 2025-08-11 17:08 UTC (permalink / raw)
  To: Christian Loehle
  Cc: arighi, void, linux-kernel, sched-ext, changwoo, hodgesd, mingo,
	peterz

On Sat, Aug 09, 2025 at 09:01:18AM -1000, Tejun Heo wrote:
> On Tue, Aug 05, 2025 at 12:10:35PM +0100, Christian Loehle wrote:
> > Provide scx_bpf_task_acquire_remote_curr() as a way for scx schedulers
> > to check the curr task of a remote rq without assuming its lock is
> > held.
> > 
> > Many scx schedulers make use of scx_bpf_cpu_rq() to check a remote curr
> > (e.g. to see if it should be preempted). This is problematic because
> > scx_bpf_cpu_rq() provides access to all fields of struct rq, most of
> > which aren't safe to use without holding the associated rq lock.
> > 
> > Signed-off-by: Christian Loehle <christian.loehle@arm.com>
> 
> Applied 1-2 to sched_ext/for-6.17-fixes.

Reverted due to compatibility issues. Let's try again with warnings as
discussed in the other subthread.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
  2025-08-11 14:35       ` Jake Hillion
  2025-08-11 16:51         ` Tejun Heo
@ 2025-08-12 13:31         ` Andrea Righi
  1 sibling, 0 replies; 14+ messages in thread
From: Andrea Righi @ 2025-08-12 13:31 UTC (permalink / raw)
  To: Jake Hillion
  Cc: Tejun Heo, Christian Loehle, void, linux-kernel, sched-ext,
	changwoo, hodgesd, mingo, peterz

On Mon, Aug 11, 2025 at 03:35:05PM +0100, Jake Hillion wrote:
> On Sun, Aug 10, 2025 at 12:52:53PM +0200, Andrea Righi wrote:
> > Yeah, this is not nice, but they would be still broken though, in PATCH 1/3
> > we force schedulers to check for NULL and, if they don't, the verifier
> > won't be happy, so this already breaks existing binaries.
> 
> I ran some testing on the sched_ext for-next branch, and scx_cosmos is
> breaking in cosmos_init including the latest changes. I believe it kicks
> off a timer in init, which indirectly calls
> `scx_bpf_cpu_rq(cpu)->curr->flags & PF_IDLE`. This should be NULL
> checked, but old binaries breaking is pretty inconvenient for new users.
> 
> As Andrea says, this is the already merged patch triggering this.

We should provide a compat helper in common.bpf.h and fix the schedulers to
use this helper. Something like the following (untested):

static inline struct task_struct *
__COMPAT_scx_bpf_task_acquire_remote_curr(s32 cpu)
{
	struct rq *rq;

	if (bpf_ksym_exists(scx_bpf_task_acquire_remote_curr))
		return scx_bpf_task_acquire_remote_curr(cpu);

	rq = scx_bpf_cpu_rq(cpu);

	return rq ? rq->curr : NULL;
}

Then we can drop this after a couple of kernel releases (like in v6.20).

-Andrea

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2025-08-12 13:32 UTC | newest]
