* [PATCHSET sched/urgent, sched_ext/for-6.12-fixes] sched/core, sched_ext: Add ENQUEUE_RQ_SELECTED to fix ->select_task_rq() skip detection in sched_ext schedulers
@ 2024-09-27 23:46 Tejun Heo
2024-09-27 23:46 ` [PATCH 1/3] sched/core: Make select_task_rq() take the pointer to wake_flags instead of value Tejun Heo
` (3 more replies)
0 siblings, 4 replies; 14+ messages in thread
From: Tejun Heo @ 2024-09-27 23:46 UTC (permalink / raw)
To: void, peterz, mingo; +Cc: kernel-team, linux-kernel, sched-ext
During ttwu, ->select_task_rq() can be skipped if only one CPU is allowed or
migration is disabled. sched_ext schedulers may perform operations such as
direct dispatch from ->select_task_rq() path and it is useful for them to
know whether ->select_task_rq() was skipped in the ->enqueue_task() path.
Currently, sched_ext schedulers are using ENQUEUE_WAKEUP for this purpose
and end up assuming incorrectly that ->select_task_rq() was called for tasks
that are bound to a single CPU or migration disabled.
This patchset adds ENQUEUE_RQ_SELECTED to indicate whether
->select_task_rq() was called to ->enqueue_task() and propagate that to
sched_ext schedulers so that they can reliably detect whether
->select_task_rq() was skipped.
Peter, please let me know how 0001-0002 should be routed. If they get
applied to sched/urgent (when it opens), I'll pull it into
sched_ext/for-6.12-fixes and apply 0003 on top. If you'd prefer, I can take
all three through sched_ext/for-6.12-fixes too.
This patchset contains the following patches:
0001-sched-core-Make-select_task_rq-take-the-pointer-to-w.patch
0002-sched-core-Add-ENQUEUE_RQ_SELECTED-to-indicate-wheth.patch
0003-sched_ext-scx_qmap-Add-and-use-SCX_ENQ_CPU_SELECTED.patch
0001-0002 are sched/core patches to add ENQUEUE_RQ_SELECTED.
0003 makes sched_ext propagate the flag as SCX_ENQ_CPU_SELECTED and fix
scx_qmap using the new flag.
and can also be found in the following git branch:
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext.git scx-ENQUEUE_RQ_SELECTED
diffstat follows. Thanks.
kernel/sched/core.c | 21 ++++++++++++++-------
kernel/sched/ext.c | 1 +
kernel/sched/sched.h | 3 +++
tools/sched_ext/scx_qmap.bpf.c | 4 ++--
4 files changed, 20 insertions(+), 9 deletions(-)
--
tejun
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/3] sched/core: Make select_task_rq() take the pointer to wake_flags instead of value
2024-09-27 23:46 [PATCHSET sched/urgent, sched_ext/for-6.12-fixes] sched/core, sched_ext: Add ENQUEUE_RQ_SELECTED to fix ->select_task_rq() skip detection in sched_ext schedulers Tejun Heo
@ 2024-09-27 23:46 ` Tejun Heo
2024-09-28 0:38 ` David Vernet
2024-09-27 23:46 ` [PATCH 2/3] sched/core: Add ENQUEUE_RQ_SELECTED to indicate whether ->select_task_rq() was called Tejun Heo
` (2 subsequent siblings)
3 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2024-09-27 23:46 UTC (permalink / raw)
To: void, peterz, mingo; +Cc: kernel-team, linux-kernel, sched-ext, Tejun Heo
This will be used to allow select_task_rq() to indicate whether
->select_task_rq() was called by modifying *wake_flags.
This makes try_to_wake_up() call all functions that take wake_flags with
WF_TTWU set. Previously, only select_task_rq() was. Using the same flags is
more consistent, and, as the flag is only tested by ->select_task_rq()
implementations, it doesn't cause any behavior differences.
Signed-off-by: Tejun Heo <tj@kernel.org>
---
kernel/sched/core.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 43e453ab7e20..e70b57a5693e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3518,12 +3518,12 @@ static int select_fallback_rq(int cpu, struct task_struct *p)
* The caller (fork, wakeup) owns p->pi_lock, ->cpus_ptr is stable.
*/
static inline
-int select_task_rq(struct task_struct *p, int cpu, int wake_flags)
+int select_task_rq(struct task_struct *p, int cpu, int *wake_flags)
{
lockdep_assert_held(&p->pi_lock);
if (p->nr_cpus_allowed > 1 && !is_migration_disabled(p))
- cpu = p->sched_class->select_task_rq(p, cpu, wake_flags);
+ cpu = p->sched_class->select_task_rq(p, cpu, *wake_flags);
else
cpu = cpumask_any(p->cpus_ptr);
@@ -4120,6 +4120,8 @@ int try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
guard(preempt)();
int cpu, success = 0;
+ wake_flags |= WF_TTWU;
+
if (p == current) {
/*
* We're waking current, this means 'p->on_rq' and 'task_cpu(p)
@@ -4252,7 +4254,7 @@ int try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
*/
smp_cond_load_acquire(&p->on_cpu, !VAL);
- cpu = select_task_rq(p, p->wake_cpu, wake_flags | WF_TTWU);
+ cpu = select_task_rq(p, p->wake_cpu, &wake_flags);
if (task_cpu(p) != cpu) {
if (p->in_iowait) {
delayacct_blkio_end(p);
@@ -4793,6 +4795,7 @@ void wake_up_new_task(struct task_struct *p)
{
struct rq_flags rf;
struct rq *rq;
+ int wake_flags = WF_FORK;
raw_spin_lock_irqsave(&p->pi_lock, rf.flags);
WRITE_ONCE(p->__state, TASK_RUNNING);
@@ -4807,7 +4810,7 @@ void wake_up_new_task(struct task_struct *p)
*/
p->recent_used_cpu = task_cpu(p);
rseq_migrate(p);
- __set_task_cpu(p, select_task_rq(p, task_cpu(p), WF_FORK));
+ __set_task_cpu(p, select_task_rq(p, task_cpu(p), &wake_flags));
#endif
rq = __task_rq_lock(p, &rf);
update_rq_clock(rq);
@@ -4815,7 +4818,7 @@ void wake_up_new_task(struct task_struct *p)
activate_task(rq, p, ENQUEUE_NOCLOCK | ENQUEUE_INITIAL);
trace_sched_wakeup_new(p);
- wakeup_preempt(rq, p, WF_FORK);
+ wakeup_preempt(rq, p, wake_flags);
#ifdef CONFIG_SMP
if (p->sched_class->task_woken) {
/*
--
2.46.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 2/3] sched/core: Add ENQUEUE_RQ_SELECTED to indicate whether ->select_task_rq() was called
2024-09-27 23:46 [PATCHSET sched/urgent, sched_ext/for-6.12-fixes] sched/core, sched_ext: Add ENQUEUE_RQ_SELECTED to fix ->select_task_rq() skip detection in sched_ext schedulers Tejun Heo
2024-09-27 23:46 ` [PATCH 1/3] sched/core: Make select_task_rq() take the pointer to wake_flags instead of value Tejun Heo
@ 2024-09-27 23:46 ` Tejun Heo
2024-09-28 0:38 ` David Vernet
2024-10-01 20:12 ` Tejun Heo
2024-09-27 23:46 ` [PATCH 3/3] sched_ext, scx_qmap: Add and use SCX_ENQ_CPU_SELECTED Tejun Heo
2024-10-07 20:20 ` [PATCHSET sched/urgent, sched_ext/for-6.12-fixes] sched/core, sched_ext: Add ENQUEUE_RQ_SELECTED to fix ->select_task_rq() skip detection in sched_ext schedulers Tejun Heo
3 siblings, 2 replies; 14+ messages in thread
From: Tejun Heo @ 2024-09-27 23:46 UTC (permalink / raw)
To: void, peterz, mingo; +Cc: kernel-team, linux-kernel, sched-ext, Tejun Heo
During ttwu, ->select_task_rq() can be skipped if only one CPU is allowed or
migration is disabled. sched_ext schedulers may perform operations such as
direct dispatch from ->select_task_rq() path and it is useful for them to
know whether ->select_task_rq() was skipped in the ->enqueue_task() path.
Currently, sched_ext schedulers are using ENQUEUE_WAKEUP for this purpose
and end up assuming incorrectly that ->select_task_rq() was called for tasks
that are bound to a single CPU or migration disabled.
Make select_task_rq() indicate whether ->select_task_rq() was called by
setting WF_RQ_SELECTED in *wake_flags and make ttwu_do_activate() map that
to ENQUEUE_RQ_SELECTED for ->enqueue_task().
This will be used by sched_ext to fix ->select_task_rq() skip detection.
Signed-off-by: Tejun Heo <tj@kernel.org>
---
kernel/sched/core.c | 8 ++++++--
kernel/sched/sched.h | 3 +++
2 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e70b57a5693e..aeb595514461 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3522,10 +3522,12 @@ int select_task_rq(struct task_struct *p, int cpu, int *wake_flags)
{
lockdep_assert_held(&p->pi_lock);
- if (p->nr_cpus_allowed > 1 && !is_migration_disabled(p))
+ if (p->nr_cpus_allowed > 1 && !is_migration_disabled(p)) {
cpu = p->sched_class->select_task_rq(p, cpu, *wake_flags);
- else
+ *wake_flags |= WF_RQ_SELECTED;
+ } else {
cpu = cpumask_any(p->cpus_ptr);
+ }
/*
* In order not to call set_task_cpu() on a blocking task we need
@@ -3659,6 +3661,8 @@ ttwu_do_activate(struct rq *rq, struct task_struct *p, int wake_flags,
rq->nr_uninterruptible--;
#ifdef CONFIG_SMP
+ if (wake_flags & WF_RQ_SELECTED)
+ en_flags |= ENQUEUE_RQ_SELECTED;
if (wake_flags & WF_MIGRATED)
en_flags |= ENQUEUE_MIGRATED;
else
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index b1c3588a8f00..6085ef50febf 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2292,6 +2292,7 @@ static inline int task_on_rq_migrating(struct task_struct *p)
#define WF_SYNC 0x10 /* Waker goes to sleep after wakeup */
#define WF_MIGRATED 0x20 /* Internal use, task got migrated */
#define WF_CURRENT_CPU 0x40 /* Prefer to move the wakee to the current CPU. */
+#define WF_RQ_SELECTED 0x80 /* ->select_task_rq() was called */
#ifdef CONFIG_SMP
static_assert(WF_EXEC == SD_BALANCE_EXEC);
@@ -2334,6 +2335,7 @@ extern const u32 sched_prio_to_wmult[40];
* ENQUEUE_HEAD - place at front of runqueue (tail if not specified)
* ENQUEUE_REPLENISH - CBS (replenish runtime and postpone deadline)
* ENQUEUE_MIGRATED - the task was migrated during wakeup
+ * ENQUEUE_RQ_SELECTED - ->select_task_rq() was called
*
*/
@@ -2360,6 +2362,7 @@ extern const u32 sched_prio_to_wmult[40];
#define ENQUEUE_INITIAL 0x80
#define ENQUEUE_MIGRATING 0x100
#define ENQUEUE_DELAYED 0x200
+#define ENQUEUE_RQ_SELECTED 0x400
#define RETRY_TASK ((void *)-1UL)
--
2.46.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 3/3] sched_ext, scx_qmap: Add and use SCX_ENQ_CPU_SELECTED
2024-09-27 23:46 [PATCHSET sched/urgent, sched_ext/for-6.12-fixes] sched/core, sched_ext: Add ENQUEUE_RQ_SELECTED to fix ->select_task_rq() skip detection in sched_ext schedulers Tejun Heo
2024-09-27 23:46 ` [PATCH 1/3] sched/core: Make select_task_rq() take the pointer to wake_flags instead of value Tejun Heo
2024-09-27 23:46 ` [PATCH 2/3] sched/core: Add ENQUEUE_RQ_SELECTED to indicate whether ->select_task_rq() was called Tejun Heo
@ 2024-09-27 23:46 ` Tejun Heo
2024-09-28 0:39 ` David Vernet
2024-09-28 13:21 ` Andrea Righi
2024-10-07 20:20 ` [PATCHSET sched/urgent, sched_ext/for-6.12-fixes] sched/core, sched_ext: Add ENQUEUE_RQ_SELECTED to fix ->select_task_rq() skip detection in sched_ext schedulers Tejun Heo
3 siblings, 2 replies; 14+ messages in thread
From: Tejun Heo @ 2024-09-27 23:46 UTC (permalink / raw)
To: void, peterz, mingo
Cc: kernel-team, linux-kernel, sched-ext, Tejun Heo, Daniel Hodges,
Changwoo Min, Andrea Righi, Dan Schatzberg
scx_qmap and other schedulers in the SCX repo are using SCX_ENQ_WAKEUP to
tell whether ops.select_cpu() was called. This is incorrect as
ops.select_cpu() can be skipped in the wakeup path and leads to e.g.
incorrectly skipping direct dispatch for tasks that are bound to a single
CPU.
sched core has been udpated to specify ENQUEUE_RQ_SELECTED if
->select_task_rq() was called. Map it to SCX_ENQ_CPU_SELECTED and update
scx_qmap to test it instead of SCX_ENQ_WAKEUP.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: David Vernet <void@manifault.com>
Cc: Daniel Hodges <hodges.daniel.scott@gmail.com>
Cc: Changwoo Min <multics69@gmail.com>
Cc: Andrea Righi <andrea.righi@linux.dev>
Cc: Dan Schatzberg <schatzberg.dan@gmail.com>
---
kernel/sched/ext.c | 1 +
tools/sched_ext/scx_qmap.bpf.c | 4 ++--
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index c09e3dc38c34..9f00c8b629f1 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -691,6 +691,7 @@ enum scx_enq_flags {
/* expose select ENQUEUE_* flags as enums */
SCX_ENQ_WAKEUP = ENQUEUE_WAKEUP,
SCX_ENQ_HEAD = ENQUEUE_HEAD,
+ SCX_ENQ_CPU_SELECTED = ENQUEUE_RQ_SELECTED,
/* high 32bits are SCX specific */
diff --git a/tools/sched_ext/scx_qmap.bpf.c b/tools/sched_ext/scx_qmap.bpf.c
index 83c8f54c1e31..588b7dce44fa 100644
--- a/tools/sched_ext/scx_qmap.bpf.c
+++ b/tools/sched_ext/scx_qmap.bpf.c
@@ -230,8 +230,8 @@ void BPF_STRUCT_OPS(qmap_enqueue, struct task_struct *p, u64 enq_flags)
return;
}
- /* if !WAKEUP, select_cpu() wasn't called, try direct dispatch */
- if (!(enq_flags & SCX_ENQ_WAKEUP) &&
+ /* if select_cpu() wasn't called, try direct dispatch */
+ if (!(enq_flags & SCX_ENQ_CPU_SELECTED) &&
(cpu = pick_direct_dispatch_cpu(p, scx_bpf_task_cpu(p))) >= 0) {
__sync_fetch_and_add(&nr_ddsp_from_enq, 1);
scx_bpf_dispatch(p, SCX_DSQ_LOCAL_ON | cpu, slice_ns, enq_flags);
--
2.46.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 1/3] sched/core: Make select_task_rq() take the pointer to wake_flags instead of value
2024-09-27 23:46 ` [PATCH 1/3] sched/core: Make select_task_rq() take the pointer to wake_flags instead of value Tejun Heo
@ 2024-09-28 0:38 ` David Vernet
0 siblings, 0 replies; 14+ messages in thread
From: David Vernet @ 2024-09-28 0:38 UTC (permalink / raw)
To: Tejun Heo; +Cc: peterz, mingo, kernel-team, linux-kernel, sched-ext
[-- Attachment #1: Type: text/plain, Size: 577 bytes --]
On Fri, Sep 27, 2024 at 01:46:11PM -1000, Tejun Heo wrote:
> This will be used to allow select_task_rq() to indicate whether
> ->select_task_rq() was called by modifying *wake_flags.
>
> This makes try_to_wake_up() call all functions that take wake_flags with
> WF_TTWU set. Previously, only select_task_rq() was. Using the same flags is
> more consistent, and, as the flag is only tested by ->select_task_rq()
> implementations, it doesn't cause any behavior differences.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David Vernet <void@manifault.com>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/3] sched/core: Add ENQUEUE_RQ_SELECTED to indicate whether ->select_task_rq() was called
2024-09-27 23:46 ` [PATCH 2/3] sched/core: Add ENQUEUE_RQ_SELECTED to indicate whether ->select_task_rq() was called Tejun Heo
@ 2024-09-28 0:38 ` David Vernet
2024-10-01 20:12 ` Tejun Heo
1 sibling, 0 replies; 14+ messages in thread
From: David Vernet @ 2024-09-28 0:38 UTC (permalink / raw)
To: Tejun Heo; +Cc: peterz, mingo, kernel-team, linux-kernel, sched-ext
[-- Attachment #1: Type: text/plain, Size: 970 bytes --]
On Fri, Sep 27, 2024 at 01:46:12PM -1000, Tejun Heo wrote:
> During ttwu, ->select_task_rq() can be skipped if only one CPU is allowed or
> migration is disabled. sched_ext schedulers may perform operations such as
> direct dispatch from ->select_task_rq() path and it is useful for them to
> know whether ->select_task_rq() was skipped in the ->enqueue_task() path.
>
> Currently, sched_ext schedulers are using ENQUEUE_WAKEUP for this purpose
> and end up assuming incorrectly that ->select_task_rq() was called for tasks
> that are bound to a single CPU or migration disabled.
>
> Make select_task_rq() indicate whether ->select_task_rq() was called by
> setting WF_RQ_SELECTED in *wake_flags and make ttwu_do_activate() map that
> to ENQUEUE_RQ_SELECTED for ->enqueue_task().
>
> This will be used by sched_ext to fix ->select_task_rq() skip detection.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David Vernet <void@manifault.com>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 3/3] sched_ext, scx_qmap: Add and use SCX_ENQ_CPU_SELECTED
2024-09-27 23:46 ` [PATCH 3/3] sched_ext, scx_qmap: Add and use SCX_ENQ_CPU_SELECTED Tejun Heo
@ 2024-09-28 0:39 ` David Vernet
2024-09-28 13:21 ` Andrea Righi
1 sibling, 0 replies; 14+ messages in thread
From: David Vernet @ 2024-09-28 0:39 UTC (permalink / raw)
To: Tejun Heo
Cc: peterz, mingo, kernel-team, linux-kernel, sched-ext,
Daniel Hodges, Changwoo Min, Andrea Righi, Dan Schatzberg
[-- Attachment #1: Type: text/plain, Size: 673 bytes --]
On Fri, Sep 27, 2024 at 01:46:13PM -1000, Tejun Heo wrote:
> scx_qmap and other schedulers in the SCX repo are using SCX_ENQ_WAKEUP to
> tell whether ops.select_cpu() was called. This is incorrect as
> ops.select_cpu() can be skipped in the wakeup path and leads to e.g.
> incorrectly skipping direct dispatch for tasks that are bound to a single
> CPU.
>
> sched core has been udpated to specify ENQUEUE_RQ_SELECTED if
s/udpated/updated
> ->select_task_rq() was called. Map it to SCX_ENQ_CPU_SELECTED and update
> scx_qmap to test it instead of SCX_ENQ_WAKEUP.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David Vernet <void@manifault.com>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 3/3] sched_ext, scx_qmap: Add and use SCX_ENQ_CPU_SELECTED
2024-09-27 23:46 ` [PATCH 3/3] sched_ext, scx_qmap: Add and use SCX_ENQ_CPU_SELECTED Tejun Heo
2024-09-28 0:39 ` David Vernet
@ 2024-09-28 13:21 ` Andrea Righi
2024-09-28 16:52 ` Tejun Heo
1 sibling, 1 reply; 14+ messages in thread
From: Andrea Righi @ 2024-09-28 13:21 UTC (permalink / raw)
To: Tejun Heo
Cc: void, peterz, mingo, kernel-team, linux-kernel, sched-ext,
Daniel Hodges, Changwoo Min, Dan Schatzberg
On Fri, Sep 27, 2024 at 01:46:13PM -1000, Tejun Heo wrote:
> scx_qmap and other schedulers in the SCX repo are using SCX_ENQ_WAKEUP to
> tell whether ops.select_cpu() was called. This is incorrect as
> ops.select_cpu() can be skipped in the wakeup path and leads to e.g.
> incorrectly skipping direct dispatch for tasks that are bound to a single
> CPU.
>
> sched core has been udpated to specify ENQUEUE_RQ_SELECTED if
> ->select_task_rq() was called. Map it to SCX_ENQ_CPU_SELECTED and update
> scx_qmap to test it instead of SCX_ENQ_WAKEUP.
Even if it's quite convenient to have the SCX_ENQ_CPU_SELECTED flag
provided by kernel, I was wondering if we could just delegate this whole
logic to BPF and avoid introducing this extra flag in the kernel.
In theory we could track when ops.select_cpu() is called by setting a
flag in the BPF task context (BPF_MAP_TYPE_TASK_STORAGE). Specifically,
the flag could be set in ops.select_cpu() and cleared in ops.stopping().
Then, when ops.enqueue() is called, we can check the flag to determine
whether ops.select_cpu() was skipped or not.
Since most of the scx schedulers already implement their own task
context, this shouldn't add too much complexity/overhead to the BPF
code, it'd be fully backward-compatible and it doesn't depend on the
particular kernel logic that calls ->select_task_rq(). WDYT?
Thanks,
-Andrea
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: David Vernet <void@manifault.com>
> Cc: Daniel Hodges <hodges.daniel.scott@gmail.com>
> Cc: Changwoo Min <multics69@gmail.com>
> Cc: Andrea Righi <andrea.righi@linux.dev>
> Cc: Dan Schatzberg <schatzberg.dan@gmail.com>
> ---
> kernel/sched/ext.c | 1 +
> tools/sched_ext/scx_qmap.bpf.c | 4 ++--
> 2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index c09e3dc38c34..9f00c8b629f1 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -691,6 +691,7 @@ enum scx_enq_flags {
> /* expose select ENQUEUE_* flags as enums */
> SCX_ENQ_WAKEUP = ENQUEUE_WAKEUP,
> SCX_ENQ_HEAD = ENQUEUE_HEAD,
> + SCX_ENQ_CPU_SELECTED = ENQUEUE_RQ_SELECTED,
>
> /* high 32bits are SCX specific */
>
> diff --git a/tools/sched_ext/scx_qmap.bpf.c b/tools/sched_ext/scx_qmap.bpf.c
> index 83c8f54c1e31..588b7dce44fa 100644
> --- a/tools/sched_ext/scx_qmap.bpf.c
> +++ b/tools/sched_ext/scx_qmap.bpf.c
> @@ -230,8 +230,8 @@ void BPF_STRUCT_OPS(qmap_enqueue, struct task_struct *p, u64 enq_flags)
> return;
> }
>
> - /* if !WAKEUP, select_cpu() wasn't called, try direct dispatch */
> - if (!(enq_flags & SCX_ENQ_WAKEUP) &&
> + /* if select_cpu() wasn't called, try direct dispatch */
> + if (!(enq_flags & SCX_ENQ_CPU_SELECTED) &&
> (cpu = pick_direct_dispatch_cpu(p, scx_bpf_task_cpu(p))) >= 0) {
> __sync_fetch_and_add(&nr_ddsp_from_enq, 1);
> scx_bpf_dispatch(p, SCX_DSQ_LOCAL_ON | cpu, slice_ns, enq_flags);
> --
> 2.46.2
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 3/3] sched_ext, scx_qmap: Add and use SCX_ENQ_CPU_SELECTED
2024-09-28 13:21 ` Andrea Righi
@ 2024-09-28 16:52 ` Tejun Heo
0 siblings, 0 replies; 14+ messages in thread
From: Tejun Heo @ 2024-09-28 16:52 UTC (permalink / raw)
To: Andrea Righi
Cc: void, peterz, mingo, kernel-team, linux-kernel, sched-ext,
Daniel Hodges, Changwoo Min, Dan Schatzberg
Hello,
On Sat, Sep 28, 2024 at 03:21:46PM +0200, Andrea Righi wrote:
...
> > sched core has been udpated to specify ENQUEUE_RQ_SELECTED if
> > ->select_task_rq() was called. Map it to SCX_ENQ_CPU_SELECTED and update
> > scx_qmap to test it instead of SCX_ENQ_WAKEUP.
>
> Even if it's quite convenient to have the SCX_ENQ_CPU_SELECTED flag
> provided by kernel, I was wondering if we could just delegate this whole
> logic to BPF and avoid introducing this extra flag in the kernel.
>
> In theory we could track when ops.select_cpu() is called by setting a
> flag in the BPF task context (BPF_MAP_TYPE_TASK_STORAGE). Specifically,
> the flag could be set in ops.select_cpu() and cleared in ops.stopping().
> Then, when ops.enqueue() is called, we can check the flag to determine
> whether ops.select_cpu() was skipped or not.
>
> Since most of the scx schedulers already implement their own task
> context, this shouldn't add too much complexity/overhead to the BPF
> code, it'd be fully backward-compatible and it doesn't depend on the
> particular kernel logic that calls ->select_task_rq(). WDYT?
Yeah, that would work too and probably what we should do to work around on
older kernels, but also it's a relatively obvious hole in the API and we
don't lose anything by updating the kernel to indicate the state.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/3] sched/core: Add ENQUEUE_RQ_SELECTED to indicate whether ->select_task_rq() was called
2024-09-27 23:46 ` [PATCH 2/3] sched/core: Add ENQUEUE_RQ_SELECTED to indicate whether ->select_task_rq() was called Tejun Heo
2024-09-28 0:38 ` David Vernet
@ 2024-10-01 20:12 ` Tejun Heo
2024-10-04 20:14 ` Tejun Heo
1 sibling, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2024-10-01 20:12 UTC (permalink / raw)
To: void, peterz, mingo; +Cc: kernel-team, linux-kernel, sched-ext
On Fri, Sep 27, 2024 at 01:46:12PM -1000, Tejun Heo wrote:
> During ttwu, ->select_task_rq() can be skipped if only one CPU is allowed or
> migration is disabled. sched_ext schedulers may perform operations such as
> direct dispatch from ->select_task_rq() path and it is useful for them to
> know whether ->select_task_rq() was skipped in the ->enqueue_task() path.
>
> Currently, sched_ext schedulers are using ENQUEUE_WAKEUP for this purpose
> and end up assuming incorrectly that ->select_task_rq() was called for tasks
> that are bound to a single CPU or migration disabled.
>
> Make select_task_rq() indicate whether ->select_task_rq() was called by
> setting WF_RQ_SELECTED in *wake_flags and make ttwu_do_activate() map that
> to ENQUEUE_RQ_SELECTED for ->enqueue_task().
>
> This will be used by sched_ext to fix ->select_task_rq() skip detection.
Peter, ping on patches 1-2. If they look okay, I can route them through
sched_ext/for-6.12-fixes.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/3] sched/core: Add ENQUEUE_RQ_SELECTED to indicate whether ->select_task_rq() was called
2024-10-01 20:12 ` Tejun Heo
@ 2024-10-04 20:14 ` Tejun Heo
2024-10-05 9:38 ` Peter Zijlstra
0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2024-10-04 20:14 UTC (permalink / raw)
To: void, peterz, mingo; +Cc: kernel-team, linux-kernel, sched-ext
On Tue, Oct 01, 2024 at 10:12:53AM -1000, Tejun Heo wrote:
> On Fri, Sep 27, 2024 at 01:46:12PM -1000, Tejun Heo wrote:
> > During ttwu, ->select_task_rq() can be skipped if only one CPU is allowed or
> > migration is disabled. sched_ext schedulers may perform operations such as
> > direct dispatch from ->select_task_rq() path and it is useful for them to
> > know whether ->select_task_rq() was skipped in the ->enqueue_task() path.
> >
> > Currently, sched_ext schedulers are using ENQUEUE_WAKEUP for this purpose
> > and end up assuming incorrectly that ->select_task_rq() was called for tasks
> > that are bound to a single CPU or migration disabled.
> >
> > Make select_task_rq() indicate whether ->select_task_rq() was called by
> > setting WF_RQ_SELECTED in *wake_flags and make ttwu_do_activate() map that
> > to ENQUEUE_RQ_SELECTED for ->enqueue_task().
> >
> > This will be used by sched_ext to fix ->select_task_rq() skip detection.
>
> Peter, ping on patches 1-2. If they look okay, I can route them through
> sched_ext/for-6.12-fixes.
As the core changes are on the trivial side, I'll wait till early next week
and route them through sched_ext/for-6.12-fixes. Please holler for any
concerns.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/3] sched/core: Add ENQUEUE_RQ_SELECTED to indicate whether ->select_task_rq() was called
2024-10-04 20:14 ` Tejun Heo
@ 2024-10-05 9:38 ` Peter Zijlstra
2024-10-07 16:44 ` Tejun Heo
0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2024-10-05 9:38 UTC (permalink / raw)
To: Tejun Heo; +Cc: void, mingo, kernel-team, linux-kernel, sched-ext
On Fri, Oct 04, 2024 at 10:14:20AM -1000, Tejun Heo wrote:
> On Tue, Oct 01, 2024 at 10:12:53AM -1000, Tejun Heo wrote:
> > On Fri, Sep 27, 2024 at 01:46:12PM -1000, Tejun Heo wrote:
> > > During ttwu, ->select_task_rq() can be skipped if only one CPU is allowed or
> > > migration is disabled. sched_ext schedulers may perform operations such as
> > > direct dispatch from ->select_task_rq() path and it is useful for them to
> > > know whether ->select_task_rq() was skipped in the ->enqueue_task() path.
> > >
> > > Currently, sched_ext schedulers are using ENQUEUE_WAKEUP for this purpose
> > > and end up assuming incorrectly that ->select_task_rq() was called for tasks
> > > that are bound to a single CPU or migration disabled.
> > >
> > > Make select_task_rq() indicate whether ->select_task_rq() was called by
> > > setting WF_RQ_SELECTED in *wake_flags and make ttwu_do_activate() map that
> > > to ENQUEUE_RQ_SELECTED for ->enqueue_task().
> > >
> > > This will be used by sched_ext to fix ->select_task_rq() skip detection.
> >
> > Peter, ping on patches 1-2. If they look okay, I can route them through
> > sched_ext/for-6.12-fixes.
>
> As the core changes are on the trivial side, I'll wait till early next week
> and route them through sched_ext/for-6.12-fixes. Please holler for any
> concerns.
Right, I've been busy chasing merge window fallout, and this didn't seem
that urgent. I'm not exactly liking it, but it isn't too horrible. So
yeah, you can take it I suppose.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/3] sched/core: Add ENQUEUE_RQ_SELECTED to indicate whether ->select_task_rq() was called
2024-10-05 9:38 ` Peter Zijlstra
@ 2024-10-07 16:44 ` Tejun Heo
0 siblings, 0 replies; 14+ messages in thread
From: Tejun Heo @ 2024-10-07 16:44 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: void, mingo, kernel-team, linux-kernel, sched-ext
On Sat, Oct 05, 2024 at 11:38:04AM +0200, Peter Zijlstra wrote:
> Right, I've been busy chasing merge window fallout, and this didn't seem
> that urgent. I'm not exactly liking it, but it isn't too horrible. So
> yeah, you can take it I suppose.
Thanks for taking a look. Will route them through sched_ext/for-6.12-fixes.
--
tejun
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCHSET sched/urgent, sched_ext/for-6.12-fixes] sched/core, sched_ext: Add ENQUEUE_RQ_SELECTED to fix ->select_task_rq() skip detection in sched_ext schedulers
2024-09-27 23:46 [PATCHSET sched/urgent, sched_ext/for-6.12-fixes] sched/core, sched_ext: Add ENQUEUE_RQ_SELECTED to fix ->select_task_rq() skip detection in sched_ext schedulers Tejun Heo
` (2 preceding siblings ...)
2024-09-27 23:46 ` [PATCH 3/3] sched_ext, scx_qmap: Add and use SCX_ENQ_CPU_SELECTED Tejun Heo
@ 2024-10-07 20:20 ` Tejun Heo
3 siblings, 0 replies; 14+ messages in thread
From: Tejun Heo @ 2024-10-07 20:20 UTC (permalink / raw)
To: void, peterz, mingo; +Cc: kernel-team, linux-kernel, sched-ext
On Fri, Sep 27, 2024 at 01:46:10PM -1000, Tejun Heo wrote:
> During ttwu, ->select_task_rq() can be skipped if only one CPU is allowed or
> migration is disabled. sched_ext schedulers may perform operations such as
> direct dispatch from ->select_task_rq() path and it is useful for them to
> know whether ->select_task_rq() was skipped in the ->enqueue_task() path.
>
> Currently, sched_ext schedulers are using ENQUEUE_WAKEUP for this purpose
> and end up assuming incorrectly that ->select_task_rq() was called for tasks
> that are bound to a single CPU or migration disabled.
>
> This patchset adds ENQUEUE_RQ_SELECTED to indicate whether
> ->select_task_rq() was called to ->enqueue_task() and propagate that to
> sched_ext schedulers so that they can reliably detect whether
> ->select_task_rq() was skipped.
>
> Peter, please let me know how 0001-0002 should be routed. If they get
> applied to sched/urgent (when it opens), I'll pull it into
> sched_ext/for-6.12-fixes and apply 0003 on top. If you'd prefer, I can take
> all three through sched_ext/for-6.12-fixes too.
>
> This patchset contains the following patches:
>
> 0001-sched-core-Make-select_task_rq-take-the-pointer-to-w.patch
> 0002-sched-core-Add-ENQUEUE_RQ_SELECTED-to-indicate-wheth.patch
> 0003-sched_ext-scx_qmap-Add-and-use-SCX_ENQ_CPU_SELECTED.patch
>
> 0001-0002 are sched/core patches to add ENQUEUE_RQ_SELECTED.
>
> 0003 makes sched_ext propagate the flag as SCX_ENQ_CPU_SELECTED and fix
> scx_qmap using the new flag.
Applied 1-3 to sched_ext/for-6.12-fixes.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2024-10-07 20:20 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-27 23:46 [PATCHSET sched/urgent, sched_ext/for-6.12-fixes] sched/core, sched_ext: Add ENQUEUE_RQ_SELECTED to fix ->select_task_rq() skip detection in sched_ext schedulers Tejun Heo
2024-09-27 23:46 ` [PATCH 1/3] sched/core: Make select_task_rq() take the pointer to wake_flags instead of value Tejun Heo
2024-09-28 0:38 ` David Vernet
2024-09-27 23:46 ` [PATCH 2/3] sched/core: Add ENQUEUE_RQ_SELECTED to indicate whether ->select_task_rq() was called Tejun Heo
2024-09-28 0:38 ` David Vernet
2024-10-01 20:12 ` Tejun Heo
2024-10-04 20:14 ` Tejun Heo
2024-10-05 9:38 ` Peter Zijlstra
2024-10-07 16:44 ` Tejun Heo
2024-09-27 23:46 ` [PATCH 3/3] sched_ext, scx_qmap: Add and use SCX_ENQ_CPU_SELECTED Tejun Heo
2024-09-28 0:39 ` David Vernet
2024-09-28 13:21 ` Andrea Righi
2024-09-28 16:52 ` Tejun Heo
2024-10-07 20:20 ` [PATCHSET sched/urgent, sched_ext/for-6.12-fixes] sched/core, sched_ext: Add ENQUEUE_RQ_SELECTED to fix ->select_task_rq() skip detection in sched_ext schedulers Tejun Heo
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.