* [PATCH v2 0/3] sched_ext: Harden scx_bpf_cpu_rq()
From: Christian Loehle @ 2025-08-04 11:27 UTC
  To: tj, arighi, void
  Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz,
	Christian Loehle

scx_bpf_cpu_rq() currently allows accessing struct rq fields without
holding the associated rq lock.
It is used by scx_cosmos, scx_flash, scx_lavd, scx_layered, and
scx_tickless. Fortunately it is only ever used to fetch rq->curr.
So provide an alternative, scx_bpf_remote_curr(), that doesn't expose
struct rq, and harden scx_bpf_cpu_rq() by ensuring the caller holds the
rq lock.

This also simplifies scx code from:

rq = scx_bpf_cpu_rq(cpu);
if (!rq)
	return;
p = rq->curr;
if (!p)
	return;
/* ... Do something with p */

into:

p = scx_bpf_remote_curr(cpu);
if (!p)
	return;
/* ... Do something with p */
bpf_task_release(p);


Patch 1 was previously submitted and can be applied independently of
the other two.
https://lore.kernel.org/lkml/43a9cbdc-5121-4dc8-8438-0f01c90a4687@arm.com/
https://lore.kernel.org/lkml/0b8111c6-1b14-41dc-a674-14a6361992b3@arm.com/

v1:
https://lore.kernel.org/lkml/20250801141741.355059-1-christian.loehle@arm.com/
- scx_bpf_cpu_rq() now errors when an unlocked rq is requested. (Andrea)
- scx_bpf_remote_curr() calls bpf_task_acquire(), which the BPF user
  needs to release. (Andrea)

Christian Loehle (3):
  sched_ext: Mark scx_bpf_cpu_rq as NULL returnable
  sched_ext: Provide scx_bpf_remote_curr()
  sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()

 kernel/sched/ext.c                       | 36 ++++++++++++++++++++++--
 tools/sched_ext/include/scx/common.bpf.h |  1 +
 2 files changed, 35 insertions(+), 2 deletions(-)

--
2.34.1



* [PATCH v2 1/3] sched_ext: Mark scx_bpf_cpu_rq as NULL returnable
From: Christian Loehle @ 2025-08-04 11:27 UTC
  To: tj, arighi, void
  Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz,
	Christian Loehle, stable

scx_bpf_cpu_rq() returns NULL for an invalid CPU.
Mark it as such with KF_RET_NULL.
While kf_cpu_valid() will trigger scx_ops_error(), which leads to the
BPF scheduler exiting, this isn't guaranteed to be immediate, allowing
for a dereference of a NULL scx_bpf_cpu_rq() return value.
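
For illustration (not part of this patch), with KF_RET_NULL the
verifier only accepts programs that NULL-check the returned pointer:

rq = scx_bpf_cpu_rq(cpu);
if (!rq)	/* check now enforced by the verifier */
	return;
/* rq may be dereferenced here */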

Cc: stable@vger.kernel.org
Fixes: 6203ef73fa5c ("sched/ext: Add BPF function to fetch rq")
Signed-off-by: Christian Loehle <christian.loehle@arm.com>
Acked-by: Andrea Righi <arighi@nvidia.com>
---
 kernel/sched/ext.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 7dedc9a16281..3ea3f0f18030 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -7589,7 +7589,7 @@ BTF_ID_FLAGS(func, scx_bpf_get_online_cpumask, KF_ACQUIRE)
 BTF_ID_FLAGS(func, scx_bpf_put_cpumask, KF_RELEASE)
 BTF_ID_FLAGS(func, scx_bpf_task_running, KF_RCU)
 BTF_ID_FLAGS(func, scx_bpf_task_cpu, KF_RCU)
-BTF_ID_FLAGS(func, scx_bpf_cpu_rq)
+BTF_ID_FLAGS(func, scx_bpf_cpu_rq, KF_RET_NULL)
 #ifdef CONFIG_CGROUP_SCHED
 BTF_ID_FLAGS(func, scx_bpf_task_cgroup, KF_RCU | KF_ACQUIRE)
 #endif
-- 
2.34.1



* [PATCH v2 2/3] sched_ext: Provide scx_bpf_remote_curr()
From: Christian Loehle @ 2025-08-04 11:27 UTC
  To: tj, arighi, void
  Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz,
	Christian Loehle

Provide scx_bpf_remote_curr() as a way for scx schedulers to
check the curr task of a remote rq, without assuming its lock
is held.

Many scx schedulers make use of scx_bpf_cpu_rq() to check a
remote curr (e.g. to see if it should be preempted). This is
problematic because scx_bpf_cpu_rq() provides access to all
fields of struct rq, most of which aren't safe to use without
holding the associated rq lock.
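
A minimal BPF-side usage sketch (illustrative; since the kfunc is
KF_ACQUIRE, the verifier enforces the paired release):

p = scx_bpf_remote_curr(cpu);
if (!p)
	return;
/* e.g. decide whether p should be preempted */
bpf_task_release(p);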

Signed-off-by: Christian Loehle <christian.loehle@arm.com>
---
 kernel/sched/ext.c                       | 24 ++++++++++++++++++++++++
 tools/sched_ext/include/scx/common.bpf.h |  1 +
 2 files changed, 25 insertions(+)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 3ea3f0f18030..1d9d9cbed0aa 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -7426,6 +7426,29 @@ __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
 	return cpu_rq(cpu);
 }

+struct task_struct *bpf_task_acquire(struct task_struct *p);
+
+/**
+ * scx_bpf_remote_curr - Fetch the curr of a rq without acquiring its rq lock
+ * @cpu: CPU of the rq
+ *
+ * Increments the refcount of the task_struct which needs to be released later.
+ */
+__bpf_kfunc struct task_struct *scx_bpf_remote_curr(s32 cpu)
+{
+	struct task_struct *p;
+
+	if (!kf_cpu_valid(cpu, NULL))
+		return NULL;
+
+	rcu_read_lock();
+	p = cpu_rq(cpu)->curr;
+	if (p)
+		p = bpf_task_acquire(p);
+	rcu_read_unlock();
+	return p;
+}
+
 /**
  * scx_bpf_task_cgroup - Return the sched cgroup of a task
  * @p: task of interest
@@ -7590,6 +7613,7 @@ BTF_ID_FLAGS(func, scx_bpf_put_cpumask, KF_RELEASE)
 BTF_ID_FLAGS(func, scx_bpf_task_running, KF_RCU)
 BTF_ID_FLAGS(func, scx_bpf_task_cpu, KF_RCU)
 BTF_ID_FLAGS(func, scx_bpf_cpu_rq, KF_RET_NULL)
+BTF_ID_FLAGS(func, scx_bpf_remote_curr, KF_RET_NULL | KF_ACQUIRE)
 #ifdef CONFIG_CGROUP_SCHED
 BTF_ID_FLAGS(func, scx_bpf_task_cgroup, KF_RCU | KF_ACQUIRE)
 #endif
diff --git a/tools/sched_ext/include/scx/common.bpf.h b/tools/sched_ext/include/scx/common.bpf.h
index d4e21558e982..e5d4ef124532 100644
--- a/tools/sched_ext/include/scx/common.bpf.h
+++ b/tools/sched_ext/include/scx/common.bpf.h
@@ -91,6 +91,7 @@ s32 scx_bpf_pick_any_cpu(const cpumask_t *cpus_allowed, u64 flags) __ksym;
 bool scx_bpf_task_running(const struct task_struct *p) __ksym;
 s32 scx_bpf_task_cpu(const struct task_struct *p) __ksym;
 struct rq *scx_bpf_cpu_rq(s32 cpu) __ksym;
+struct task_struct *scx_bpf_remote_curr(s32 cpu) __ksym;
 struct cgroup *scx_bpf_task_cgroup(struct task_struct *p) __ksym __weak;
 u64 scx_bpf_now(void) __ksym __weak;
 void scx_bpf_events(struct scx_event_stats *events, size_t events__sz) __ksym __weak;
--
2.34.1



* [PATCH v2 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
From: Christian Loehle @ 2025-08-04 11:27 UTC
  To: tj, arighi, void
  Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz,
	Christian Loehle

Most fields of the rq returned by scx_bpf_cpu_rq() assume that its
rq lock is held, and they become meaningless without it.
Only return a non-NULL rq from scx_bpf_cpu_rq() if we hold the rq lock
of that rq.

All upstream scx schedulers can be converted to use the new
scx_bpf_remote_curr() instead.
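
For example (illustrative only), a callback that already runs with the
task's rq lock held, such as ops.tick(), can keep doing:

rq = scx_bpf_cpu_rq(scx_bpf_task_cpu(p));

while requesting any other CPU's rq now returns NULL and raises an scx
error.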

Signed-off-by: Christian Loehle <christian.loehle@arm.com>
---
 kernel/sched/ext.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 1d9d9cbed0aa..0b05ddc1f100 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -7420,10 +7420,18 @@ __bpf_kfunc s32 scx_bpf_task_cpu(const struct task_struct *p)
  */
 __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
 {
+	struct rq *rq;
+
 	if (!kf_cpu_valid(cpu, NULL))
 		return NULL;
 
-	return cpu_rq(cpu);
+	rq = cpu_rq(cpu);
+	if (rq != scx_locked_rq_state) {
+		scx_kf_error("Accessing not locked rq %d", cpu);
+		return NULL;
+	}
+
+	return rq;
 }
 
 struct task_struct *bpf_task_acquire(struct task_struct *p);
-- 
2.34.1



* Re: [PATCH v2 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
From: Andrea Righi @ 2025-08-04 12:41 UTC
  To: Christian Loehle
  Cc: tj, void, linux-kernel, sched-ext, changwoo, hodgesd, mingo,
	peterz

On Mon, Aug 04, 2025 at 12:27:43PM +0100, Christian Loehle wrote:
> Most fields of the rq returned by scx_bpf_cpu_rq() assume that its
> rq lock is held, and they become meaningless without it.
> Only return a non-NULL rq from scx_bpf_cpu_rq() if we hold the rq lock
> of that rq.
> 
> All upstream scx schedulers can be converted to use the new
> scx_bpf_remote_curr() instead.
> 
> Signed-off-by: Christian Loehle <christian.loehle@arm.com>
> ---
>  kernel/sched/ext.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 1d9d9cbed0aa..0b05ddc1f100 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -7420,10 +7420,18 @@ __bpf_kfunc s32 scx_bpf_task_cpu(const struct task_struct *p)
>   */
>  __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
>  {
> +	struct rq *rq;
> +
>  	if (!kf_cpu_valid(cpu, NULL))
>  		return NULL;
>  
> -	return cpu_rq(cpu);
> +	rq = cpu_rq(cpu);
> +	if (rq != scx_locked_rq_state) {

I think you want to check rq != scx_locked_rq(), since scx_locked_rq_state
is a per-CPU variable.
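
(The accessor reads the current CPU's slot, i.e. something like:

	static inline struct rq *scx_locked_rq(void)
	{
		return __this_cpu_read(scx_locked_rq_state);
	}

so the comparison has to go through it.)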

We may also want to add a preempt_disable/enable() for consistency. How
about something like this?

	preempt_disable();
	rq = cpu_rq(cpu);
	if (rq != scx_locked_rq()) {
		scx_kf_error("Accessing CPU%d rq from CPU%d without holding its lock",
			     cpu, smp_processor_id());
		rq = NULL;
	}
	preempt_enable();

Thanks,
-Andrea

> +		scx_kf_error("Accessing not locked rq %d", cpu);
> +		return NULL;
> +	}
> +
> +	return rq;
>  }
>  
>  struct task_struct *bpf_task_acquire(struct task_struct *p);
> -- 
> 2.34.1
> 


* Re: [PATCH v2 2/3] sched_ext: Provide scx_bpf_remote_curr()
From: Andrea Righi @ 2025-08-04 12:51 UTC
  To: Christian Loehle
  Cc: tj, void, linux-kernel, sched-ext, changwoo, hodgesd, mingo,
	peterz

On Mon, Aug 04, 2025 at 12:27:42PM +0100, Christian Loehle wrote:
> Provide scx_bpf_remote_curr() as a way for scx schedulers to
> check the curr task of a remote rq, without assuming its lock
> is held.
> 
> Many scx schedulers make use of scx_bpf_cpu_rq() to check a
> remote curr (e.g. to see if it should be preempted). This is
> problematic because scx_bpf_cpu_rq() provides access to all
> fields of struct rq, most of which aren't safe to use without
> holding the associated rq lock.
> 
> Signed-off-by: Christian Loehle <christian.loehle@arm.com>
> ---
>  kernel/sched/ext.c                       | 24 ++++++++++++++++++++++++
>  tools/sched_ext/include/scx/common.bpf.h |  1 +
>  2 files changed, 25 insertions(+)
> 
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 3ea3f0f18030..1d9d9cbed0aa 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -7426,6 +7426,29 @@ __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
>  	return cpu_rq(cpu);
>  }
> 
> +struct task_struct *bpf_task_acquire(struct task_struct *p);

Can we move include <linux/btf.h> all the way to the top? In this way we
don't have to add this forward declaration.

> +
> +/**
> + * scx_bpf_remote_curr - Fetch the curr of a rq without acquiring its rq lock
> + * @cpu: CPU of the rq
> + *
> + * Increments the refcount of the task_struct which needs to be released later.

Maybe we should mention that the task must be released by calling
bpf_task_release().

While at it, what do you think about renaming this to something like
scx_bpf_task_acquire_on_cpu(), so that it looks similar to
bpf_task_acquire()?

> + */
> +__bpf_kfunc struct task_struct *scx_bpf_remote_curr(s32 cpu)
> +{
> +	struct task_struct *p;
> +
> +	if (!kf_cpu_valid(cpu, NULL))
> +		return NULL;
> +
> +	rcu_read_lock();
> +	p = cpu_rq(cpu)->curr;
> +	if (p)
> +		p = bpf_task_acquire(p);
> +	rcu_read_unlock();
> +	return p;
> +}
> +
>  /**
>   * scx_bpf_task_cgroup - Return the sched cgroup of a task
>   * @p: task of interest
> @@ -7590,6 +7613,7 @@ BTF_ID_FLAGS(func, scx_bpf_put_cpumask, KF_RELEASE)
>  BTF_ID_FLAGS(func, scx_bpf_task_running, KF_RCU)
>  BTF_ID_FLAGS(func, scx_bpf_task_cpu, KF_RCU)
>  BTF_ID_FLAGS(func, scx_bpf_cpu_rq, KF_RET_NULL)
> +BTF_ID_FLAGS(func, scx_bpf_remote_curr, KF_RET_NULL | KF_ACQUIRE)
>  #ifdef CONFIG_CGROUP_SCHED
>  BTF_ID_FLAGS(func, scx_bpf_task_cgroup, KF_RCU | KF_ACQUIRE)
>  #endif
> diff --git a/tools/sched_ext/include/scx/common.bpf.h b/tools/sched_ext/include/scx/common.bpf.h
> index d4e21558e982..e5d4ef124532 100644
> --- a/tools/sched_ext/include/scx/common.bpf.h
> +++ b/tools/sched_ext/include/scx/common.bpf.h
> @@ -91,6 +91,7 @@ s32 scx_bpf_pick_any_cpu(const cpumask_t *cpus_allowed, u64 flags) __ksym;
>  bool scx_bpf_task_running(const struct task_struct *p) __ksym;
>  s32 scx_bpf_task_cpu(const struct task_struct *p) __ksym;
>  struct rq *scx_bpf_cpu_rq(s32 cpu) __ksym;
> +struct task_struct *scx_bpf_remote_curr(s32 cpu) __ksym;
>  struct cgroup *scx_bpf_task_cgroup(struct task_struct *p) __ksym __weak;
>  u64 scx_bpf_now(void) __ksym __weak;
>  void scx_bpf_events(struct scx_event_stats *events, size_t events__sz) __ksym __weak;
> --
> 2.34.1
> 

Thanks,
-Andrea


* Re: [PATCH v2 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
From: Christian Loehle @ 2025-08-04 13:02 UTC
  To: Andrea Righi
  Cc: tj, void, linux-kernel, sched-ext, changwoo, hodgesd, mingo,
	peterz

On 8/4/25 13:41, Andrea Righi wrote:
> On Mon, Aug 04, 2025 at 12:27:43PM +0100, Christian Loehle wrote:
>> Most fields of the rq returned by scx_bpf_cpu_rq() assume that its
>> rq lock is held, and they become meaningless without it.
>> Only return a non-NULL rq from scx_bpf_cpu_rq() if we hold the rq lock
>> of that rq.
>>
>> All upstream scx schedulers can be converted to use the new
>> scx_bpf_remote_curr() instead.
>>
>> Signed-off-by: Christian Loehle <christian.loehle@arm.com>
>> ---
>>  kernel/sched/ext.c | 10 +++++++++-
>>  1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
>> index 1d9d9cbed0aa..0b05ddc1f100 100644
>> --- a/kernel/sched/ext.c
>> +++ b/kernel/sched/ext.c
>> @@ -7420,10 +7420,18 @@ __bpf_kfunc s32 scx_bpf_task_cpu(const struct task_struct *p)
>>   */
>>  __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
>>  {
>> +	struct rq *rq;
>> +
>>  	if (!kf_cpu_valid(cpu, NULL))
>>  		return NULL;
>>  
>> -	return cpu_rq(cpu);
>> +	rq = cpu_rq(cpu);
>> +	if (rq != scx_locked_rq_state) {
> 
> I think you want to check rq != scx_locked_rq(), since scx_locked_rq_state
> is a per-CPU variable.

Duh, of course. m(

> 
> We may also want to add a preempt_disable/enable() for consistency. How
> about something like this?
> 
> 	preempt_disable();
> 	rq = cpu_rq(cpu);
> 	if (rq != scx_locked_rq()) {
> 		scx_kf_error("Accessing CPU%d rq from CPU%d without holding its lock",
> 			     cpu, smp_processor_id());
> 		rq = NULL;
> 	}
> 	preempt_enable();

Ack



* Re: [PATCH v2 2/3] sched_ext: Provide scx_bpf_remote_curr()
From: Christian Loehle @ 2025-08-04 13:27 UTC
  To: Andrea Righi
  Cc: tj, void, linux-kernel, sched-ext, changwoo, hodgesd, mingo,
	peterz

On 8/4/25 13:51, Andrea Righi wrote:
> On Mon, Aug 04, 2025 at 12:27:42PM +0100, Christian Loehle wrote:
>> Provide scx_bpf_remote_curr() as a way for scx schedulers to
>> check the curr task of a remote rq, without assuming its lock
>> is held.
>>
>> Many scx schedulers make use of scx_bpf_cpu_rq() to check a
>> remote curr (e.g. to see if it should be preempted). This is
>> problematic because scx_bpf_cpu_rq() provides access to all
>> fields of struct rq, most of which aren't safe to use without
>> holding the associated rq lock.
>>
>> Signed-off-by: Christian Loehle <christian.loehle@arm.com>
>> ---
>>  kernel/sched/ext.c                       | 24 ++++++++++++++++++++++++
>>  tools/sched_ext/include/scx/common.bpf.h |  1 +
>>  2 files changed, 25 insertions(+)
>>
>> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
>> index 3ea3f0f18030..1d9d9cbed0aa 100644
>> --- a/kernel/sched/ext.c
>> +++ b/kernel/sched/ext.c
>> @@ -7426,6 +7426,29 @@ __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
>>  	return cpu_rq(cpu);
>>  }
>>
>> +struct task_struct *bpf_task_acquire(struct task_struct *p);
> 
> Can we move include <linux/btf.h> all the way to the top? In this way we
> don't have to add this forward declaration.
> 

At least currently <linux/btf.h> doesn't declare bpf_task_acquire.
I'm not quite sure what's most elegant here; none of the bpf helper
kfuncs seem to be used from kernel code such as kernel/sched/ext.c.


>> +
>> +/**
>> + * scx_bpf_remote_curr - Fetch the curr of a rq without acquiring its rq lock
>> + * @cpu: CPU of the rq
>> + *
>> + * Increments the refcount of the task_struct which needs to be released later.
> 
> Maybe we should mention that the task must be released by calling
> bpf_task_release().

Sure.

> 
> While at it, what do you think about renaming this to something like
> scx_bpf_task_acquire_on_cpu(), so that it looks similar to
> bpf_task_acquire()?

Will change it to
bpf_task_acquire_remote_curr()?
..on_cpu() seems like a bit of a jump semantically.

> 
>> + */
>> +__bpf_kfunc struct task_struct *scx_bpf_remote_curr(s32 cpu)
>> +{
>> +	struct task_struct *p;
>> +
>> +	if (!kf_cpu_valid(cpu, NULL))
>> +		return NULL;
>> +
>> +	rcu_read_lock();
>> +	p = cpu_rq(cpu)->curr;
>> +	if (p)
>> +		p = bpf_task_acquire(p);
>> +	rcu_read_unlock();
>> +	return p;
>> +}
>> +
>>  /**
>>   * scx_bpf_task_cgroup - Return the sched cgroup of a task
>>   * @p: task of interest
>> @@ -7590,6 +7613,7 @@ BTF_ID_FLAGS(func, scx_bpf_put_cpumask, KF_RELEASE)
>>  BTF_ID_FLAGS(func, scx_bpf_task_running, KF_RCU)
>>  BTF_ID_FLAGS(func, scx_bpf_task_cpu, KF_RCU)
>>  BTF_ID_FLAGS(func, scx_bpf_cpu_rq, KF_RET_NULL)
>> +BTF_ID_FLAGS(func, scx_bpf_remote_curr, KF_RET_NULL | KF_ACQUIRE)
>>  #ifdef CONFIG_CGROUP_SCHED
>>  BTF_ID_FLAGS(func, scx_bpf_task_cgroup, KF_RCU | KF_ACQUIRE)
>>  #endif
>> diff --git a/tools/sched_ext/include/scx/common.bpf.h b/tools/sched_ext/include/scx/common.bpf.h
>> index d4e21558e982..e5d4ef124532 100644
>> --- a/tools/sched_ext/include/scx/common.bpf.h
>> +++ b/tools/sched_ext/include/scx/common.bpf.h
>> @@ -91,6 +91,7 @@ s32 scx_bpf_pick_any_cpu(const cpumask_t *cpus_allowed, u64 flags) __ksym;
>>  bool scx_bpf_task_running(const struct task_struct *p) __ksym;
>>  s32 scx_bpf_task_cpu(const struct task_struct *p) __ksym;
>>  struct rq *scx_bpf_cpu_rq(s32 cpu) __ksym;
>> +struct task_struct *scx_bpf_remote_curr(s32 cpu) __ksym;
>>  struct cgroup *scx_bpf_task_cgroup(struct task_struct *p) __ksym __weak;
>>  u64 scx_bpf_now(void) __ksym __weak;
>>  void scx_bpf_events(struct scx_event_stats *events, size_t events__sz) __ksym __weak;
>> --
>> 2.34.1
>>
> 
> Thanks,
> -Andrea



* Re: [PATCH v2 2/3] sched_ext: Provide scx_bpf_remote_curr()
From: Andrea Righi @ 2025-08-04 15:48 UTC
  To: Christian Loehle
  Cc: tj, void, linux-kernel, sched-ext, changwoo, hodgesd, mingo,
	peterz

On Mon, Aug 04, 2025 at 02:27:14PM +0100, Christian Loehle wrote:
> On 8/4/25 13:51, Andrea Righi wrote:
> > On Mon, Aug 04, 2025 at 12:27:42PM +0100, Christian Loehle wrote:
> >> Provide scx_bpf_remote_curr() as a way for scx schedulers to
> >> check the curr task of a remote rq, without assuming its lock
> >> is held.
> >>
> >> Many scx schedulers make use of scx_bpf_cpu_rq() to check a
> >> remote curr (e.g. to see if it should be preempted). This is
> >> problematic because scx_bpf_cpu_rq() provides access to all
> >> fields of struct rq, most of which aren't safe to use without
> >> holding the associated rq lock.
> >>
> >> Signed-off-by: Christian Loehle <christian.loehle@arm.com>
> >> ---
> >>  kernel/sched/ext.c                       | 24 ++++++++++++++++++++++++
> >>  tools/sched_ext/include/scx/common.bpf.h |  1 +
> >>  2 files changed, 25 insertions(+)
> >>
> >> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> >> index 3ea3f0f18030..1d9d9cbed0aa 100644
> >> --- a/kernel/sched/ext.c
> >> +++ b/kernel/sched/ext.c
> >> @@ -7426,6 +7426,29 @@ __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
> >>  	return cpu_rq(cpu);
> >>  }
> >>
> >> +struct task_struct *bpf_task_acquire(struct task_struct *p);
> > 
> > Can we move include <linux/btf.h> all the way to the top? In this way we
> > don't have to add this forward declaration.
> > 
> 
> At least currently <linux/btf.h> doesn't declare bpf_task_acquire.
> I'm not quite sure what's most elegant here; none of the bpf helper
> kfuncs seem to be used from kernel code such as kernel/sched/ext.c.

Oh I see, it's a kfunc, so its prototype is only available in BPF.

Maybe we can use `if (refcount_inc_not_zero(&p->rcu_users))` directly,
instead of bpf_task_acquire()?
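
For illustration, that could look like (untested sketch, mirroring what
bpf_task_acquire() does internally):

	rcu_read_lock();
	p = cpu_rq(cpu)->curr;
	if (p && !refcount_inc_not_zero(&p->rcu_users))
		p = NULL;
	rcu_read_unlock();
	return p;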

> 
> 
> >> +
> >> +/**
> >> + * scx_bpf_remote_curr - Fetch the curr of a rq without acquiring its rq lock
> >> + * @cpu: CPU of the rq
> >> + *
> >> + * Increments the refcount of the task_struct which needs to be released later.
> > 
> > Maybe we should mention that the task must be released by calling
> > bpf_task_release().
> 
> Sure.
> 
> > 
> > While at it, what do you think about renaming this to something like
> > scx_bpf_task_acquire_on_cpu(), so that it looks similar to
> > bpf_task_acquire()?
> 
> Will change it to
> bpf_task_acquire_remote_curr()?
> ..on_cpu() seems like a bit of a jump semantically.

Ack.

-Andrea

