* [PATCH v2 0/3] sched_ext: Harden scx_bpf_cpu_rq()
From: Christian Loehle @ 2025-08-04 11:27 UTC
To: tj, arighi, void
Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz,
Christian Loehle
scx_bpf_cpu_rq() currently allows accessing struct rq fields without
holding the associated rq lock.
It is being used by scx_cosmos, scx_flash, scx_lavd, scx_layered, and
scx_tickless. Fortunately it is only ever used to fetch rq->curr.
So provide an alternative scx_bpf_remote_curr() that doesn't expose
struct rq and harden scx_bpf_cpu_rq() by ensuring we hold the rq lock.
This also simplifies scx code from:
rq = scx_bpf_cpu_rq(cpu);
if (!rq)
return;
p = rq->curr;
if (!p)
return;
/* ... Do something with p */
into:
p = scx_bpf_remote_curr(cpu);
if (!p)
return;
/* ... Do something with p */
bpf_task_release(p);
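For illustration, a hedged sketch of how the new pattern might look in an
scx scheduler's BPF code doing a remote preemption check. The kfuncs are
the ones from this series plus the existing scx_bpf_kick_cpu(); the
function name and the weight comparison are hypothetical placeholders:

	/*
	 * Sketch only: kick @cpu if the task currently running there
	 * has a lower weight than @p. The acquired reference must be
	 * dropped with bpf_task_release() on every path.
	 */
	static bool try_preempt_cpu(s32 cpu, const struct task_struct *p)
	{
		struct task_struct *curr;
		bool preempt = false;

		curr = scx_bpf_remote_curr(cpu);
		if (!curr)
			return false;

		if (curr->scx.weight < p->scx.weight) {
			scx_bpf_kick_cpu(cpu, SCX_KICK_PREEMPT);
			preempt = true;
		}

		bpf_task_release(curr);
		return preempt;
	}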
Patch 1 was previously submitted and can be applied independently of
the other two.
https://lore.kernel.org/lkml/43a9cbdc-5121-4dc8-8438-0f01c90a4687@arm.com/
https://lore.kernel.org/lkml/0b8111c6-1b14-41dc-a674-14a6361992b3@arm.com/
v1:
https://lore.kernel.org/lkml/20250801141741.355059-1-christian.loehle@arm.com/
- scx_bpf_cpu_rq() now errors when an unlocked rq is requested. (Andrea)
- scx_bpf_remote_curr() calls bpf_task_acquire(), which the BPF user needs
  to release. (Andrea)
Christian Loehle (3):
sched_ext: Mark scx_bpf_cpu_rq as NULL returnable
sched_ext: Provide scx_bpf_remote_curr()
sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
kernel/sched/ext.c | 36 ++++++++++++++++++++++--
tools/sched_ext/include/scx/common.bpf.h | 1 +
2 files changed, 35 insertions(+), 2 deletions(-)
--
2.34.1
* [PATCH v2 1/3] sched_ext: Mark scx_bpf_cpu_rq as NULL returnable
From: Christian Loehle @ 2025-08-04 11:27 UTC
To: tj, arighi, void
Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz,
Christian Loehle, stable
scx_bpf_cpu_rq() obviously returns NULL on an invalid cpu.
Mark it as such.
While kf_cpu_valid() will trigger scx_ops_error(), which leads
to the BPF scheduler exiting, this isn't guaranteed to be immediate,
allowing for a dereference of a NULL scx_bpf_cpu_rq() return value.
Cc: stable@vger.kernel.org
Fixes: 6203ef73fa5c ("sched/ext: Add BPF function to fetch rq")
Signed-off-by: Christian Loehle <christian.loehle@arm.com>
Acked-by: Andrea Righi <arighi@nvidia.com>
---
kernel/sched/ext.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 7dedc9a16281..3ea3f0f18030 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -7589,7 +7589,7 @@ BTF_ID_FLAGS(func, scx_bpf_get_online_cpumask, KF_ACQUIRE)
BTF_ID_FLAGS(func, scx_bpf_put_cpumask, KF_RELEASE)
BTF_ID_FLAGS(func, scx_bpf_task_running, KF_RCU)
BTF_ID_FLAGS(func, scx_bpf_task_cpu, KF_RCU)
-BTF_ID_FLAGS(func, scx_bpf_cpu_rq)
+BTF_ID_FLAGS(func, scx_bpf_cpu_rq, KF_RET_NULL)
#ifdef CONFIG_CGROUP_SCHED
BTF_ID_FLAGS(func, scx_bpf_task_cgroup, KF_RCU | KF_ACQUIRE)
#endif
--
2.34.1
* [PATCH v2 2/3] sched_ext: Provide scx_bpf_remote_curr()
From: Christian Loehle @ 2025-08-04 11:27 UTC
To: tj, arighi, void
Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz,
Christian Loehle
Provide scx_bpf_remote_curr() as a way for scx schedulers to
check the curr task of a remote rq, without assuming its lock
is held.
Many scx schedulers make use of scx_bpf_cpu_rq() to check a
remote curr (e.g. to see if it should be preempted). This is
problematic because scx_bpf_cpu_rq() provides access to all
fields of struct rq, most of which aren't safe to use without
holding the associated rq lock.
Signed-off-by: Christian Loehle <christian.loehle@arm.com>
---
kernel/sched/ext.c | 24 ++++++++++++++++++++++++
tools/sched_ext/include/scx/common.bpf.h | 1 +
2 files changed, 25 insertions(+)
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 3ea3f0f18030..1d9d9cbed0aa 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -7426,6 +7426,29 @@ __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
return cpu_rq(cpu);
}
+struct task_struct *bpf_task_acquire(struct task_struct *p);
+
+/**
+ * scx_bpf_remote_curr - Fetch the curr of a rq without acquiring its rq lock
+ * @cpu: CPU of the rq
+ *
+ * Increments the refcount of the task_struct which needs to be released later.
+ */
+__bpf_kfunc struct task_struct *scx_bpf_remote_curr(s32 cpu)
+{
+ struct task_struct *p;
+
+ if (!kf_cpu_valid(cpu, NULL))
+ return NULL;
+
+ rcu_read_lock();
+ p = cpu_rq(cpu)->curr;
+ if (p)
+ p = bpf_task_acquire(p);
+ rcu_read_unlock();
+ return p;
+}
+
/**
* scx_bpf_task_cgroup - Return the sched cgroup of a task
* @p: task of interest
@@ -7590,6 +7613,7 @@ BTF_ID_FLAGS(func, scx_bpf_put_cpumask, KF_RELEASE)
BTF_ID_FLAGS(func, scx_bpf_task_running, KF_RCU)
BTF_ID_FLAGS(func, scx_bpf_task_cpu, KF_RCU)
BTF_ID_FLAGS(func, scx_bpf_cpu_rq, KF_RET_NULL)
+BTF_ID_FLAGS(func, scx_bpf_remote_curr, KF_RET_NULL | KF_ACQUIRE)
#ifdef CONFIG_CGROUP_SCHED
BTF_ID_FLAGS(func, scx_bpf_task_cgroup, KF_RCU | KF_ACQUIRE)
#endif
diff --git a/tools/sched_ext/include/scx/common.bpf.h b/tools/sched_ext/include/scx/common.bpf.h
index d4e21558e982..e5d4ef124532 100644
--- a/tools/sched_ext/include/scx/common.bpf.h
+++ b/tools/sched_ext/include/scx/common.bpf.h
@@ -91,6 +91,7 @@ s32 scx_bpf_pick_any_cpu(const cpumask_t *cpus_allowed, u64 flags) __ksym;
bool scx_bpf_task_running(const struct task_struct *p) __ksym;
s32 scx_bpf_task_cpu(const struct task_struct *p) __ksym;
struct rq *scx_bpf_cpu_rq(s32 cpu) __ksym;
+struct task_struct *scx_bpf_remote_curr(s32 cpu) __ksym;
struct cgroup *scx_bpf_task_cgroup(struct task_struct *p) __ksym __weak;
u64 scx_bpf_now(void) __ksym __weak;
void scx_bpf_events(struct scx_event_stats *events, size_t events__sz) __ksym __weak;
--
2.34.1
* [PATCH v2 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
From: Christian Loehle @ 2025-08-04 11:27 UTC
To: tj, arighi, void
Cc: linux-kernel, sched-ext, changwoo, hodgesd, mingo, peterz,
Christian Loehle
Most fields of the rq returned by scx_bpf_cpu_rq() assume that its
rq lock is held; without the lock they are meaningless anyway.
Only return the rq from scx_bpf_cpu_rq() if we hold that rq's lock.
All upstream scx schedulers can be converted to use the new
scx_bpf_remote_curr() instead.
Signed-off-by: Christian Loehle <christian.loehle@arm.com>
---
kernel/sched/ext.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 1d9d9cbed0aa..0b05ddc1f100 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -7420,10 +7420,18 @@ __bpf_kfunc s32 scx_bpf_task_cpu(const struct task_struct *p)
*/
__bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
{
+ struct rq *rq;
+
if (!kf_cpu_valid(cpu, NULL))
return NULL;
- return cpu_rq(cpu);
+ rq = cpu_rq(cpu);
+ if (rq != scx_locked_rq_state) {
+ scx_kf_error("Accessing not locked rq %d", cpu);
+ return NULL;
+ }
+
+ return rq;
}
struct task_struct *bpf_task_acquire(struct task_struct *p);
--
2.34.1
* Re: [PATCH v2 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
From: Andrea Righi @ 2025-08-04 12:41 UTC
To: Christian Loehle
Cc: tj, void, linux-kernel, sched-ext, changwoo, hodgesd, mingo,
peterz
On Mon, Aug 04, 2025 at 12:27:43PM +0100, Christian Loehle wrote:
> Most fields in scx_bpf_cpu_rq() assume that its rq_lock is held.
> Furthermore they become meaningless without rq lock, too.
> Only return scx_bpf_cpu_rq() if we hold rq lock of that rq.
>
> All upstream scx schedulers can be converted into the new
> scx_bpf_remote_curr() instead.
>
> Signed-off-by: Christian Loehle <christian.loehle@arm.com>
> ---
> kernel/sched/ext.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 1d9d9cbed0aa..0b05ddc1f100 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -7420,10 +7420,18 @@ __bpf_kfunc s32 scx_bpf_task_cpu(const struct task_struct *p)
> */
> __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
> {
> + struct rq *rq;
> +
> if (!kf_cpu_valid(cpu, NULL))
> return NULL;
>
> - return cpu_rq(cpu);
> + rq = cpu_rq(cpu);
> + if (rq != scx_locked_rq_state) {
I think you want to check rq != scx_locked_rq(), since scx_locked_rq_state
is a per-CPU variable.
We may also want to add a preempt_disable/enable() for consistency. How
about something like this?
preempt_disable();
rq = cpu_rq(cpu);
if (rq != scx_locked_rq()) {
scx_kf_error("Accessing CPU%d rq from CPU%d without holding its lock",
cpu, smp_processor_id());
rq = NULL;
}
preempt_enable();
Thanks,
-Andrea
> + scx_kf_error("Accessing not locked rq %d", cpu);
> + return NULL;
> + }
> +
> + return rq;
> }
>
> struct task_struct *bpf_task_acquire(struct task_struct *p);
> --
> 2.34.1
>
* Re: [PATCH v2 2/3] sched_ext: Provide scx_bpf_remote_curr()
From: Andrea Righi @ 2025-08-04 12:51 UTC
To: Christian Loehle
Cc: tj, void, linux-kernel, sched-ext, changwoo, hodgesd, mingo,
peterz
On Mon, Aug 04, 2025 at 12:27:42PM +0100, Christian Loehle wrote:
> Provide scx_bpf_remote_curr() as a way for scx schedulers to
> check the curr task of a remote rq, without assuming its lock
> is held.
>
> Many scx schedulers make use of scx_bpf_cpu_rq() to check a
> remote curr (e.g. to see if it should be preempted). This is
> problematic because scx_bpf_cpu_rq() provides access to all
> fields of struct rq, most of which aren't safe to use without
> holding the associated rq lock.
>
> Signed-off-by: Christian Loehle <christian.loehle@arm.com>
> ---
> kernel/sched/ext.c | 24 ++++++++++++++++++++++++
> tools/sched_ext/include/scx/common.bpf.h | 1 +
> 2 files changed, 25 insertions(+)
>
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 3ea3f0f18030..1d9d9cbed0aa 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -7426,6 +7426,29 @@ __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
> return cpu_rq(cpu);
> }
>
> +struct task_struct *bpf_task_acquire(struct task_struct *p);
Can we move include <linux/btf.h> all the way to the top? In this way we
don't have to add this forward declaration.
> +
> +/**
> + * scx_bpf_remote_curr - Fetch the curr of a rq without acquiring its rq lock
> + * @cpu: CPU of the rq
> + *
> + * Increments the refcount of the task_struct which needs to be released later.
Maybe we should mention that the task must be released by calling
bpf_task_release().
While at it, what do you think about renaming this to something like
scx_bpf_task_acquire_on_cpu(), so that it looks similar to
bpf_task_acquire()?
> + */
> +__bpf_kfunc struct task_struct *scx_bpf_remote_curr(s32 cpu)
> +{
> + struct task_struct *p;
> +
> + if (!kf_cpu_valid(cpu, NULL))
> + return NULL;
> +
> + rcu_read_lock();
> + p = cpu_rq(cpu)->curr;
> + if (p)
> + p = bpf_task_acquire(p);
> + rcu_read_unlock();
> + return p;
> +}
> +
> /**
> * scx_bpf_task_cgroup - Return the sched cgroup of a task
> * @p: task of interest
> @@ -7590,6 +7613,7 @@ BTF_ID_FLAGS(func, scx_bpf_put_cpumask, KF_RELEASE)
> BTF_ID_FLAGS(func, scx_bpf_task_running, KF_RCU)
> BTF_ID_FLAGS(func, scx_bpf_task_cpu, KF_RCU)
> BTF_ID_FLAGS(func, scx_bpf_cpu_rq, KF_RET_NULL)
> +BTF_ID_FLAGS(func, scx_bpf_remote_curr, KF_RET_NULL | KF_ACQUIRE)
> #ifdef CONFIG_CGROUP_SCHED
> BTF_ID_FLAGS(func, scx_bpf_task_cgroup, KF_RCU | KF_ACQUIRE)
> #endif
> diff --git a/tools/sched_ext/include/scx/common.bpf.h b/tools/sched_ext/include/scx/common.bpf.h
> index d4e21558e982..e5d4ef124532 100644
> --- a/tools/sched_ext/include/scx/common.bpf.h
> +++ b/tools/sched_ext/include/scx/common.bpf.h
> @@ -91,6 +91,7 @@ s32 scx_bpf_pick_any_cpu(const cpumask_t *cpus_allowed, u64 flags) __ksym;
> bool scx_bpf_task_running(const struct task_struct *p) __ksym;
> s32 scx_bpf_task_cpu(const struct task_struct *p) __ksym;
> struct rq *scx_bpf_cpu_rq(s32 cpu) __ksym;
> +struct task_struct *scx_bpf_remote_curr(s32 cpu) __ksym;
> struct cgroup *scx_bpf_task_cgroup(struct task_struct *p) __ksym __weak;
> u64 scx_bpf_now(void) __ksym __weak;
> void scx_bpf_events(struct scx_event_stats *events, size_t events__sz) __ksym __weak;
> --
> 2.34.1
>
Thanks,
-Andrea
* Re: [PATCH v2 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
From: Christian Loehle @ 2025-08-04 13:02 UTC
To: Andrea Righi
Cc: tj, void, linux-kernel, sched-ext, changwoo, hodgesd, mingo,
peterz
On 8/4/25 13:41, Andrea Righi wrote:
> On Mon, Aug 04, 2025 at 12:27:43PM +0100, Christian Loehle wrote:
>> Most fields in scx_bpf_cpu_rq() assume that its rq_lock is held.
>> Furthermore they become meaningless without rq lock, too.
>> Only return scx_bpf_cpu_rq() if we hold rq lock of that rq.
>>
>> All upstream scx schedulers can be converted into the new
>> scx_bpf_remote_curr() instead.
>>
>> Signed-off-by: Christian Loehle <christian.loehle@arm.com>
>> ---
>> kernel/sched/ext.c | 10 +++++++++-
>> 1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
>> index 1d9d9cbed0aa..0b05ddc1f100 100644
>> --- a/kernel/sched/ext.c
>> +++ b/kernel/sched/ext.c
>> @@ -7420,10 +7420,18 @@ __bpf_kfunc s32 scx_bpf_task_cpu(const struct task_struct *p)
>> */
>> __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
>> {
>> + struct rq *rq;
>> +
>> if (!kf_cpu_valid(cpu, NULL))
>> return NULL;
>>
>> - return cpu_rq(cpu);
>> + rq = cpu_rq(cpu);
>> + if (rq != scx_locked_rq_state) {
>
> I think you want to check rq != scx_locked_rq(), since scx_locked_rq_state
> is a per-CPU variable.
Duh, of course. m(
>
> We may also want to add a preempt_disable/enable() for consistency. How
> about something like this?
>
> preempt_disable();
> rq = cpu_rq(cpu);
> if (rq != scx_locked_rq()) {
> scx_kf_error("Accessing CPU%d rq from CPU%d without holding its lock",
> cpu, smp_processor_id());
> rq = NULL;
> }
> preempt_enable();
Ack
* Re: [PATCH v2 2/3] sched_ext: Provide scx_bpf_remote_curr()
From: Christian Loehle @ 2025-08-04 13:27 UTC
To: Andrea Righi
Cc: tj, void, linux-kernel, sched-ext, changwoo, hodgesd, mingo,
peterz
On 8/4/25 13:51, Andrea Righi wrote:
> On Mon, Aug 04, 2025 at 12:27:42PM +0100, Christian Loehle wrote:
>> Provide scx_bpf_remote_curr() as a way for scx schedulers to
>> check the curr task of a remote rq, without assuming its lock
>> is held.
>>
>> Many scx schedulers make use of scx_bpf_cpu_rq() to check a
>> remote curr (e.g. to see if it should be preempted). This is
>> problematic because scx_bpf_cpu_rq() provides access to all
>> fields of struct rq, most of which aren't safe to use without
>> holding the associated rq lock.
>>
>> Signed-off-by: Christian Loehle <christian.loehle@arm.com>
>> ---
>> kernel/sched/ext.c | 24 ++++++++++++++++++++++++
>> tools/sched_ext/include/scx/common.bpf.h | 1 +
>> 2 files changed, 25 insertions(+)
>>
>> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
>> index 3ea3f0f18030..1d9d9cbed0aa 100644
>> --- a/kernel/sched/ext.c
>> +++ b/kernel/sched/ext.c
>> @@ -7426,6 +7426,29 @@ __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
>> return cpu_rq(cpu);
>> }
>>
>> +struct task_struct *bpf_task_acquire(struct task_struct *p);
>
> Can we move include <linux/btf.h> all the way to the top? In this way we
> don't have to add this forward declaration.
>
At least currently, <linux/btf.h> doesn't declare bpf_task_acquire().
I'm not quite sure what's most elegant here; none of the BPF helper
kfuncs seem to be used from kernel code / kernel/sched/ext.c.
>> +
>> +/**
>> + * scx_bpf_remote_curr - Fetch the curr of a rq without acquiring its rq lock
>> + * @cpu: CPU of the rq
>> + *
>> + * Increments the refcount of the task_struct which needs to be released later.
>
> Maybe we should mention that the task must be released by calling
> bpf_task_release().
Sure.
>
> While at it, what do you think about renaming this to something like
> scx_bpf_task_acquire_on_cpu(), so that it looks similar to
> bpf_task_acquire()?
Will change it to
bpf_task_acquire_remote_curr()?
..on_cpu() seems like a bit of a jump semantically.
>
>> + */
>> +__bpf_kfunc struct task_struct *scx_bpf_remote_curr(s32 cpu)
>> +{
>> + struct task_struct *p;
>> +
>> + if (!kf_cpu_valid(cpu, NULL))
>> + return NULL;
>> +
>> + rcu_read_lock();
>> + p = cpu_rq(cpu)->curr;
>> + if (p)
>> + p = bpf_task_acquire(p);
>> + rcu_read_unlock();
>> + return p;
>> +}
>> +
>> /**
>> * scx_bpf_task_cgroup - Return the sched cgroup of a task
>> * @p: task of interest
>> @@ -7590,6 +7613,7 @@ BTF_ID_FLAGS(func, scx_bpf_put_cpumask, KF_RELEASE)
>> BTF_ID_FLAGS(func, scx_bpf_task_running, KF_RCU)
>> BTF_ID_FLAGS(func, scx_bpf_task_cpu, KF_RCU)
>> BTF_ID_FLAGS(func, scx_bpf_cpu_rq, KF_RET_NULL)
>> +BTF_ID_FLAGS(func, scx_bpf_remote_curr, KF_RET_NULL | KF_ACQUIRE)
>> #ifdef CONFIG_CGROUP_SCHED
>> BTF_ID_FLAGS(func, scx_bpf_task_cgroup, KF_RCU | KF_ACQUIRE)
>> #endif
>> diff --git a/tools/sched_ext/include/scx/common.bpf.h b/tools/sched_ext/include/scx/common.bpf.h
>> index d4e21558e982..e5d4ef124532 100644
>> --- a/tools/sched_ext/include/scx/common.bpf.h
>> +++ b/tools/sched_ext/include/scx/common.bpf.h
>> @@ -91,6 +91,7 @@ s32 scx_bpf_pick_any_cpu(const cpumask_t *cpus_allowed, u64 flags) __ksym;
>> bool scx_bpf_task_running(const struct task_struct *p) __ksym;
>> s32 scx_bpf_task_cpu(const struct task_struct *p) __ksym;
>> struct rq *scx_bpf_cpu_rq(s32 cpu) __ksym;
>> +struct task_struct *scx_bpf_remote_curr(s32 cpu) __ksym;
>> struct cgroup *scx_bpf_task_cgroup(struct task_struct *p) __ksym __weak;
>> u64 scx_bpf_now(void) __ksym __weak;
>> void scx_bpf_events(struct scx_event_stats *events, size_t events__sz) __ksym __weak;
>> --
>> 2.34.1
>>
>
> Thanks,
> -Andrea
* Re: [PATCH v2 2/3] sched_ext: Provide scx_bpf_remote_curr()
From: Andrea Righi @ 2025-08-04 15:48 UTC
To: Christian Loehle
Cc: tj, void, linux-kernel, sched-ext, changwoo, hodgesd, mingo,
peterz
On Mon, Aug 04, 2025 at 02:27:14PM +0100, Christian Loehle wrote:
> On 8/4/25 13:51, Andrea Righi wrote:
> > On Mon, Aug 04, 2025 at 12:27:42PM +0100, Christian Loehle wrote:
> >> Provide scx_bpf_remote_curr() as a way for scx schedulers to
> >> check the curr task of a remote rq, without assuming its lock
> >> is held.
> >>
> >> Many scx schedulers make use of scx_bpf_cpu_rq() to check a
> >> remote curr (e.g. to see if it should be preempted). This is
> >> problematic because scx_bpf_cpu_rq() provides access to all
> >> fields of struct rq, most of which aren't safe to use without
> >> holding the associated rq lock.
> >>
> >> Signed-off-by: Christian Loehle <christian.loehle@arm.com>
> >> ---
> >> kernel/sched/ext.c | 24 ++++++++++++++++++++++++
> >> tools/sched_ext/include/scx/common.bpf.h | 1 +
> >> 2 files changed, 25 insertions(+)
> >>
> >> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> >> index 3ea3f0f18030..1d9d9cbed0aa 100644
> >> --- a/kernel/sched/ext.c
> >> +++ b/kernel/sched/ext.c
> >> @@ -7426,6 +7426,29 @@ __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
> >> return cpu_rq(cpu);
> >> }
> >>
> >> +struct task_struct *bpf_task_acquire(struct task_struct *p);
> >
> > Can we move include <linux/btf.h> all the way to the top? In this way we
> > don't have to add this forward declaration.
> >
>
> At least currently <linux/btf.h> doesn't declare bpf_task_acquire.
> I'm not quite sure what's most elegant here, none of the bpf helper
> kfuncs seem to be used from kernel code / kernel/sched/ext.c
Oh I see, it's a kfunc, so its prototype is only available in BPF.
Maybe we can use `if (refcount_inc_not_zero(&p->rcu_users))` directly,
instead of bpf_task_acquire()?
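Something like this, for example (untested sketch; it open-codes what
bpf_task_acquire() does internally, assuming the rcu_users refcount
semantics carry over):

	rcu_read_lock();
	p = cpu_rq(cpu)->curr;
	if (p && !refcount_inc_not_zero(&p->rcu_users))
		p = NULL;
	rcu_read_unlock();
	return p;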
>
>
> >> +
> >> +/**
> >> + * scx_bpf_remote_curr - Fetch the curr of a rq without acquiring its rq lock
> >> + * @cpu: CPU of the rq
> >> + *
> >> + * Increments the refcount of the task_struct which needs to be released later.
> >
> > Maybe we should mention that the task must be released by calling
> > bpf_task_release().
>
> Sure.
>
> >
> > While at it, what do you think about renaming this to something like
> > scx_bpf_task_acquire_on_cpu(), so that it looks similar to
> > bpf_task_acquire()?
>
> Will change it to
> bpf_task_acquire_remote_curr()?
> ..on_cpu() seems like a bit of a jump semantically.
Ack.
-Andrea