* [PATCH v2] sched_ext: Clarify CPU context for running/stopping callbacks
@ 2025-04-23 21:02 Andrea Righi
  2025-04-23 23:06 ` Changwoo Min
  2025-04-23 23:35 ` Tejun Heo
  0 siblings, 2 replies; 5+ messages in thread
From: Andrea Righi @ 2025-04-23 21:02 UTC (permalink / raw)
  To: Tejun Heo, David Vernet, Changwoo Min; +Cc: Jake Hillion, linux-kernel

The ops.running() and ops.stopping() callbacks can be invoked from a CPU
other than the one the task is assigned to, particularly when a task
property is changed, as both scx_next_task_scx() and dequeue_task_scx() may
run on CPUs different from the task's target CPU.

This behavior can lead to confusion or incorrect assumptions if not
properly clarified, potentially resulting in bugs (see [1]).

Therefore, update the documentation to clarify this aspect and advise
users to use scx_bpf_task_cpu() to determine the actual CPU the task
will run on or was running on.

[1] https://github.com/sched-ext/scx/pull/1728

Cc: Jake Hillion <jake@hillion.co.uk>
Cc: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
---
 kernel/sched/ext.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

Changes in v2:
 - clarify the scenario a bit more in the code comments
 - link to v1: https://lore.kernel.org/all/20250423190059.270236-1-arighi@nvidia.com/

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index ac79067dc87e6..a83232a032aa4 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -368,6 +368,15 @@ struct sched_ext_ops {
 	 * @running: A task is starting to run on its associated CPU
 	 * @p: task starting to run
 	 *
+	 * Note that this callback may be called from a CPU other than the
+	 * one the task is going to run on. This can happen when a task
+	 * property is changed (i.e., affinity), since scx_next_task_scx(),
+	 * which triggers this callback, may run on a CPU different from
+	 * the task's assigned CPU.
+	 *
+	 * Therefore, always use scx_bpf_task_cpu(@p) to determine the
+	 * target CPU the task is going to use.
+	 *
 	 * See ->runnable() for explanation on the task state notifiers.
 	 */
 	void (*running)(struct task_struct *p);
@@ -377,6 +386,15 @@ struct sched_ext_ops {
 	 * @p: task stopping to run
 	 * @runnable: is task @p still runnable?
 	 *
+	 * Note that this callback may be called from a CPU other than the
+	 * one the task was running on. This can happen when a task
+	 * property is changed (i.e., affinity), since dequeue_task_scx(),
+	 * which triggers this callback, may run on a CPU different from
+	 * the task's assigned CPU.
+	 *
+	 * Therefore, always use scx_bpf_task_cpu(@p) to retrieve the CPU
+	 * the task was running on.
+	 *
 	 * See ->runnable() for explanation on the task state notifiers. If
 	 * !@runnable, ->quiescent() will be invoked after this operation
 	 * returns.
-- 
2.49.0
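
As an illustration of the pitfall documented above, here is a minimal
userspace sketch in plain C. It is not kernel or BPF code. struct task,
task_cpu(), model_running() and model_stopping() are hypothetical stand-ins
invented for this example: task_cpu() plays the role of scx_bpf_task_cpu(),
and the current_cpu argument plays the role of whatever CPU the callback
happens to execute on (in a real BPF scheduler this would typically come
from bpf_get_smp_processor_id()). The sketch keeps two sets of per-CPU
"running" counters, one indexed by the callback's CPU and one indexed by
the task's CPU, to show how the former can drift.

```c
#include <assert.h>

#define NR_CPUS 4

/* Hypothetical stand-in for struct task_struct: only the assigned CPU. */
struct task {
	int cpu;
};

/* Hypothetical stand-in for scx_bpf_task_cpu(): the task's assigned CPU. */
static int task_cpu(const struct task *p)
{
	return p->cpu;
}

/* Per-CPU counters, indexed by the CPU the callback executed on. */
static int nr_running_by_current[NR_CPUS];
/* Per-CPU counters, indexed by the CPU the task is assigned to. */
static int nr_running_by_task[NR_CPUS];

/* Models ops.running() for task @p being invoked on @current_cpu. */
static void model_running(const struct task *p, int current_cpu)
{
	assert(current_cpu >= 0 && current_cpu < NR_CPUS);
	assert(task_cpu(p) >= 0 && task_cpu(p) < NR_CPUS);

	/* Fragile: assumes the callback runs on the task's CPU. */
	nr_running_by_current[current_cpu]++;
	/* Follows the advice in the patch: ask for the task's CPU. */
	nr_running_by_task[task_cpu(p)]++;
}

/* Models ops.stopping() for task @p being invoked on @current_cpu. */
static void model_stopping(const struct task *p, int current_cpu)
{
	assert(current_cpu >= 0 && current_cpu < NR_CPUS);
	assert(task_cpu(p) >= 0 && task_cpu(p) < NR_CPUS);

	nr_running_by_current[current_cpu]--;
	nr_running_by_task[task_cpu(p)]--;
}
```

For a task assigned to CPU 2, calling model_running() with current_cpu 0
and then model_stopping() with current_cpu 2 leaves the counters indexed by
the callback's CPU unbalanced (CPU 0 stays at 1, CPU 2 goes to -1), while
the counters indexed by task_cpu() return to zero.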



* Re: [PATCH v2] sched_ext: Clarify CPU context for running/stopping callbacks
  2025-04-23 21:02 [PATCH v2] sched_ext: Clarify CPU context for running/stopping callbacks Andrea Righi
@ 2025-04-23 23:06 ` Changwoo Min
  2025-04-24  5:26   ` Andrea Righi
  2025-04-23 23:35 ` Tejun Heo
  1 sibling, 1 reply; 5+ messages in thread
From: Changwoo Min @ 2025-04-23 23:06 UTC (permalink / raw)
  To: Andrea Righi, Tejun Heo, David Vernet; +Cc: Jake Hillion, linux-kernel

Hi Andrea,

On 4/24/25 06:02, Andrea Righi wrote:
> The ops.running() and ops.stopping() callbacks can be invoked from a CPU
> other than the one the task is assigned to, particularly when a task
> property is changed, as both scx_next_task_scx() and dequeue_task_scx() may
> run on CPUs different from the task's target CPU.

The same goes for ops.quiescent(), since ops.quiescent() is also
called from dequeue_task_scx().

Reviewed-by: Changwoo Min <changwoo@igalia.com>

Regards,
Changwoo Min




* Re: [PATCH v2] sched_ext: Clarify CPU context for running/stopping callbacks
  2025-04-23 21:02 [PATCH v2] sched_ext: Clarify CPU context for running/stopping callbacks Andrea Righi
  2025-04-23 23:06 ` Changwoo Min
@ 2025-04-23 23:35 ` Tejun Heo
  1 sibling, 0 replies; 5+ messages in thread
From: Tejun Heo @ 2025-04-23 23:35 UTC (permalink / raw)
  To: Andrea Righi; +Cc: David Vernet, Changwoo Min, Jake Hillion, linux-kernel

On Wed, Apr 23, 2025 at 11:02:05PM +0200, Andrea Righi wrote:
> The ops.running() and ops.stopping() callbacks can be invoked from a CPU
> other than the one the task is assigned to, particularly when a task
> property is changed, as both scx_next_task_scx() and dequeue_task_scx() may
> run on CPUs different from the task's target CPU.
> 
> This behavior can lead to confusion or incorrect assumptions if not
> properly clarified, potentially resulting in bugs (see [1]).
> 
> Therefore, update the documentation to clarify this aspect and advise
> users to use scx_bpf_task_cpu() to determine the actual CPU the task
> will run on or was running on.
> 
> [1] https://github.com/sched-ext/scx/pull/1728
> 
> Cc: Jake Hillion <jake@hillion.co.uk>
> Cc: Changwoo Min <changwoo@igalia.com>
> Signed-off-by: Andrea Righi <arighi@nvidia.com>

Applied to sched_ext/for-6.16.

Thanks.

-- 
tejun


* Re: [PATCH v2] sched_ext: Clarify CPU context for running/stopping callbacks
  2025-04-23 23:06 ` Changwoo Min
@ 2025-04-24  5:26   ` Andrea Righi
  2025-04-25  4:29     ` Changwoo Min
  0 siblings, 1 reply; 5+ messages in thread
From: Andrea Righi @ 2025-04-24  5:26 UTC (permalink / raw)
  To: Changwoo Min; +Cc: Tejun Heo, David Vernet, Jake Hillion, linux-kernel

Hi Changwoo,

On Thu, Apr 24, 2025 at 08:06:47AM +0900, Changwoo Min wrote:
> Hi Andrea,
> 
> On 4/24/25 06:02, Andrea Righi wrote:
> > The ops.running() and ops.stopping() callbacks can be invoked from a CPU
> > other than the one the task is assigned to, particularly when a task
> > property is changed, as both scx_next_task_scx() and dequeue_task_scx() may
> > run on CPUs different from the task's target CPU.
> 
> The same goes for ops.quiescent(), since ops.quiescent() is also
> called from dequeue_task_scx().

Yeah, I was a bit conflicted about mentioning this for ops.runnable() and
ops.quiescent() as well, since it's more obvious in those cases that
they're executed outside the context of the "current CPU", since the task
isn't running on any CPU yet, or it's no longer running. In the end, I
decided to update only ops.running() and ops.stopping(), where it's less
clear that the task's CPU may not match the current CPU.

Thanks for taking a look!
-Andrea



* Re: [PATCH v2] sched_ext: Clarify CPU context for running/stopping callbacks
  2025-04-24  5:26   ` Andrea Righi
@ 2025-04-25  4:29     ` Changwoo Min
  0 siblings, 0 replies; 5+ messages in thread
From: Changwoo Min @ 2025-04-25  4:29 UTC (permalink / raw)
  To: Andrea Righi; +Cc: Tejun Heo, David Vernet, Jake Hillion, linux-kernel

Hi Andrea,

On 4/24/25 14:26, Andrea Righi wrote:
> Hi Changwoo,
> 
> On Thu, Apr 24, 2025 at 08:06:47AM +0900, Changwoo Min wrote:
>> Hi Andrea,
>>
>> On 4/24/25 06:02, Andrea Righi wrote:
>>> The ops.running() and ops.stopping() callbacks can be invoked from a CPU
>>> other than the one the task is assigned to, particularly when a task
>>> property is changed, as both scx_next_task_scx() and dequeue_task_scx() may
>>> run on CPUs different from the task's target CPU.
>>
>> The same goes for ops.quiescent(), since ops.quiescent() is also
>> called from dequeue_task_scx().
> 
> Yeah, I was a bit conflicted about mentioning this for ops.runnable() and
> ops.quiescent() as well, since it's more obvious in those cases that
> they're executed outside the context of the "current CPU", since the task
> isn't running on any CPU yet, or it's no longer running. In the end, I
> decided to update only ops.running() and ops.stopping(), where it's less
> clear that the task's CPU may not match the current CPU.

That makes sense. Thanks for the clarification!

-- Changwoo


