From: Peter Zijlstra <peterz@infradead.org>
To: Tejun Heo <tj@kernel.org>
Cc: torvalds@linux-foundation.org, mingo@redhat.com,
juri.lelli@redhat.com, vincent.guittot@linaro.org,
dietmar.eggemann@arm.com, rostedt@goodmis.org,
bsegall@google.com, mgorman@suse.de, bristot@redhat.com,
vschneid@redhat.com, ast@kernel.org, daniel@iogearbox.net,
andrii@kernel.org, martin.lau@kernel.org, joshdon@google.com,
brho@google.com, pjt@google.com, derkling@google.com,
haoluo@google.com, dvernet@meta.com, dschatzberg@meta.com,
dskarlat@cs.cmu.edu, riel@surriel.com,
linux-kernel@vger.kernel.org, bpf@vger.kernel.org,
kernel-team@meta.com
Subject: Re: [PATCH 14/31] sched_ext: Implement BPF extensible scheduler class
Date: Mon, 12 Dec 2022 13:53:55 +0100 [thread overview]
Message-ID: <Y5ckYyz14bxCvv40@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <20221130082313.3241517-15-tj@kernel.org>
On Tue, Nov 29, 2022 at 10:22:56PM -1000, Tejun Heo wrote:
> diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> index d06ada2341cb..cfbfc47692eb 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -131,6 +131,7 @@
> *(__dl_sched_class) \
> *(__rt_sched_class) \
> *(__fair_sched_class) \
> + *(__ext_sched_class) \
> *(__idle_sched_class) \
> __sched_class_lowest = .;
>
> @@ -9654,8 +9675,13 @@ void __init sched_init(void)
> int i;
>
> /* Make sure the linker didn't screw up */
> - BUG_ON(&idle_sched_class != &fair_sched_class + 1 ||
> - &fair_sched_class != &rt_sched_class + 1 ||
> +#ifdef CONFIG_SCHED_CLASS_EXT
> + BUG_ON(&idle_sched_class != &ext_sched_class + 1 ||
> + &ext_sched_class != &fair_sched_class + 1);
> +#else
> + BUG_ON(&idle_sched_class != &fair_sched_class + 1);
> +#endif
> + BUG_ON(&fair_sched_class != &rt_sched_class + 1 ||
> &rt_sched_class != &dl_sched_class + 1);
> #ifdef CONFIG_SMP
> BUG_ON(&dl_sched_class != &stop_sched_class + 1);
Perhaps the saner way to write this is:
#ifdef CONFIG_SMP
BUG_ON(!sched_class_above(&stop_sched_class, &dl_sched_class));
#endif
BUG_ON(!sched_class_above(&dl_sched_class, &rt_sched_class));
BUG_ON(!sched_class_above(&rt_sched_class, &fair_sched_class));
BUG_ON(!sched_class_above(&fair_sched_class, &idle_sched_class));
#ifdef CONFIG_...
BUG_ON(!sched_class_above(&fair_sched_class, &ext_sched_class));
BUG_ON(!sched_class_above(&ext_sched_class, &idle_sched_class));
#endif
> +static inline const struct sched_class *next_active_class(const struct sched_class *class)
> +{
> + class++;
> + if (!scx_enabled() && class == &ext_sched_class)
> + class++;
> + return class;
> +}
> +
> +#define for_active_class_range(class, _from, _to) \
> + for (class = (_from); class != (_to); class = next_active_class(class))
> +
> +#define for_each_active_class(class) \
> + for_active_class_range(class, __sched_class_highest, __sched_class_lowest)
> +
> +/*
> + * SCX requires a balance() call before every pick_next_task() call including
> + * when waking up from idle.
> + */
> +#define for_balance_class_range(class, prev_class, end_class) \
> + for_active_class_range(class, (prev_class) > &ext_sched_class ? \
> + &ext_sched_class : (prev_class), (end_class))
> +
This seems quite insane; why not simply make the ext methods effective
no-ops? Both balance and pick explicitly support that already, no?
> @@ -5800,10 +5812,13 @@ static void put_prev_task_balance(struct rq *rq, struct task_struct *prev,
> * We can terminate the balance pass as soon as we know there is
> * a runnable task of @class priority or higher.
> */
> - for_class_range(class, prev->sched_class, &idle_sched_class) {
> + for_balance_class_range(class, prev->sched_class, &idle_sched_class) {
> if (class->balance(rq, prev, rf))
> break;
> }
> +#else
> + /* SCX needs the balance call even in UP, call it explicitly */
This, *WHY* !?!
> + balance_scx_on_up(rq, prev, rf);
> #endif
>
> put_prev_task(rq, prev);
> @@ -5818,6 +5833,9 @@ __pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
> const struct sched_class *class;
> struct task_struct *p;
>
> + if (scx_enabled())
> + goto restart;
> +
> /*
> * Optimization: we know that if all tasks are in the fair class we can
> * call that function directly, but only if the @prev task wasn't of a
> @@ -5843,7 +5861,7 @@ __pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
> restart:
> put_prev_task_balance(rq, prev, rf);
>
> - for_each_class(class) {
> + for_each_active_class(class) {
> p = class->pick_next_task(rq);
> if (p)
> return p;
> @@ -5876,7 +5894,7 @@ static inline struct task_struct *pick_task(struct rq *rq)
> const struct sched_class *class;
> struct task_struct *p;
>
> - for_each_class(class) {
> + for_each_active_class(class) {
> p = class->pick_task(rq);
> if (p)
> return p;
But this.. afaict that means that:
- the whole EXT thing is incompatible with SCHED_CORE
- the whole EXT thing can be trivially starved by the presence of a
single CFS/BATCH/IDLE task.
Both seems like deal breakers.
next prev parent reply other threads:[~2022-12-12 12:54 UTC|newest]
Thread overview: 92+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-30 8:22 [PATCHSET RFC] sched: Implement BPF extensible scheduler class Tejun Heo
2022-11-30 8:22 ` [PATCH 01/31] rhashtable: Allow rhashtable to be used from irq-safe contexts Tejun Heo
2022-11-30 16:35 ` Linus Torvalds
2022-11-30 17:00 ` Tejun Heo
2022-12-06 21:36 ` [PATCH v2 " Tejun Heo
2022-12-09 10:50 ` patchwork-bot+netdevbpf
2022-11-30 8:22 ` [PATCH 02/31] cgroup: Implement cgroup_show_cftypes() Tejun Heo
2022-11-30 8:22 ` [PATCH 03/31] BPF: Add @prog to bpf_struct_ops->check_member() Tejun Heo
2022-11-30 8:22 ` [PATCH 04/31] sched: Allow sched_cgroup_fork() to fail and introduce sched_cancel_fork() Tejun Heo
2022-12-12 11:13 ` Peter Zijlstra
2022-12-12 18:03 ` Tejun Heo
2022-12-12 20:07 ` Peter Zijlstra
2022-12-12 20:12 ` Tejun Heo
2022-11-30 8:22 ` [PATCH 05/31] sched: Add sched_class->reweight_task() Tejun Heo
2022-12-12 11:22 ` Peter Zijlstra
2022-12-12 17:34 ` Tejun Heo
2022-12-12 20:11 ` Peter Zijlstra
2022-12-12 20:15 ` Tejun Heo
2022-11-30 8:22 ` [PATCH 06/31] sched: Add sched_class->switching_to() and expose check_class_changing/changed() Tejun Heo
2022-12-12 11:28 ` Peter Zijlstra
2022-12-12 17:59 ` Tejun Heo
2022-11-30 8:22 ` [PATCH 07/31] sched: Factor out cgroup weight conversion functions Tejun Heo
2022-11-30 8:22 ` [PATCH 08/31] sched: Expose css_tg() and __setscheduler_prio() in kernel/sched/sched.h Tejun Heo
2022-12-12 11:49 ` Peter Zijlstra
2022-12-12 17:47 ` Tejun Heo
2022-11-30 8:22 ` [PATCH 09/31] sched: Enumerate CPU cgroup file types Tejun Heo
2022-11-30 8:22 ` [PATCH 10/31] sched: Add @reason to sched_class->rq_{on|off}line() Tejun Heo
2022-12-12 11:57 ` Peter Zijlstra
2022-12-12 18:06 ` Tejun Heo
2022-11-30 8:22 ` [PATCH 11/31] sched: Add @reason to sched_move_task() Tejun Heo
2022-12-12 12:00 ` Peter Zijlstra
2022-12-12 17:54 ` Tejun Heo
2022-11-30 8:22 ` [PATCH 12/31] sched: Add normal_policy() Tejun Heo
2022-11-30 8:22 ` [PATCH 13/31] sched_ext: Add boilerplate for extensible scheduler class Tejun Heo
2022-11-30 8:22 ` [PATCH 14/31] sched_ext: Implement BPF " Tejun Heo
2022-12-02 17:08 ` Barret Rhoden
2022-12-02 18:01 ` Tejun Heo
2022-12-06 21:42 ` Tejun Heo
2022-12-06 21:44 ` Tejun Heo
2022-12-11 22:33 ` Julia Lawall
2022-12-12 2:15 ` Tejun Heo
2022-12-12 6:03 ` Julia Lawall
2022-12-12 6:08 ` Tejun Heo
2022-12-12 12:31 ` Peter Zijlstra
2022-12-12 20:03 ` Tejun Heo
2022-12-12 12:53 ` Peter Zijlstra [this message]
2022-12-12 21:33 ` Tejun Heo
2022-12-13 10:55 ` Peter Zijlstra
2022-12-13 18:12 ` Tejun Heo
2022-12-13 18:40 ` Rik van Riel
2022-12-13 23:20 ` Josh Don
2022-12-13 10:57 ` Peter Zijlstra
2022-12-13 17:32 ` Tejun Heo
2022-11-30 8:22 ` [PATCH 15/31] sched_ext: [TEMPORARY] Add temporary workaround kfunc helpers Tejun Heo
2022-11-30 8:22 ` [PATCH 16/31] sched_ext: Add scx_example_dummy and scx_example_qmap example schedulers Tejun Heo
2022-11-30 8:22 ` [PATCH 17/31] sched_ext: Add sysrq-S which disables the BPF scheduler Tejun Heo
2022-11-30 8:23 ` [PATCH 18/31] sched_ext: Implement runnable task stall watchdog Tejun Heo
2022-11-30 8:23 ` [PATCH 19/31] sched_ext: Allow BPF schedulers to disallow specific tasks from joining SCHED_EXT Tejun Heo
2022-11-30 8:23 ` [PATCH 20/31] sched_ext: Allow BPF schedulers to switch all eligible tasks into sched_ext Tejun Heo
2022-11-30 8:23 ` [PATCH 21/31] sched_ext: Implement scx_bpf_kick_cpu() and task preemption support Tejun Heo
2022-11-30 8:23 ` [PATCH 22/31] sched_ext: Add task state tracking operations Tejun Heo
2022-11-30 8:23 ` [PATCH 23/31] sched_ext: Implement tickless support Tejun Heo
2022-11-30 8:23 ` [PATCH 24/31] sched_ext: Add cgroup support Tejun Heo
2022-11-30 8:23 ` [PATCH 25/31] sched_ext: Implement SCX_KICK_WAIT Tejun Heo
2022-11-30 8:23 ` [PATCH 26/31] sched_ext: Implement sched_ext_ops.cpu_acquire/release() Tejun Heo
2022-11-30 8:23 ` [PATCH 27/31] sched_ext: Implement sched_ext_ops.cpu_online/offline() Tejun Heo
2022-11-30 8:23 ` [PATCH 28/31] sched_ext: Add Documentation/scheduler/sched-ext.rst Tejun Heo
2022-12-12 4:01 ` Bagas Sanjaya
2022-12-12 6:28 ` Tejun Heo
2022-12-12 13:07 ` Bagas Sanjaya
2022-12-12 17:30 ` Tejun Heo
2022-12-12 12:39 ` Peter Zijlstra
2022-12-12 17:16 ` Tejun Heo
2022-11-30 8:23 ` [PATCH 29/31] sched_ext: Add a basic, userland vruntime scheduler Tejun Heo
2022-11-30 8:23 ` [PATCH 30/31] BPF: [TEMPORARY] Nerf BTF scalar value check Tejun Heo
2022-11-30 8:23 ` [PATCH 31/31] sched_ext: Add a rust userspace hybrid example scheduler Tejun Heo
2022-12-12 14:03 ` Peter Zijlstra
2022-12-12 21:05 ` Peter Oskolkov
2022-12-13 11:02 ` Peter Zijlstra
2022-12-13 18:24 ` Peter Oskolkov
2022-12-12 22:00 ` Tejun Heo
2022-12-12 22:18 ` Josh Don
2022-12-13 11:30 ` Peter Zijlstra
2022-12-13 20:33 ` Tejun Heo
2022-12-14 2:00 ` Josh Don
2022-12-12 9:37 ` [PATCHSET RFC] sched: Implement BPF extensible scheduler class Peter Zijlstra
2022-12-12 17:27 ` Tejun Heo
2022-12-12 10:14 ` Peter Zijlstra
2022-12-14 2:11 ` Josh Don
2022-12-14 8:55 ` Peter Zijlstra
2022-12-14 22:23 ` Tejun Heo
2022-12-14 23:20 ` Barret Rhoden
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y5ckYyz14bxCvv40@hirez.programming.kicks-ass.net \
--to=peterz@infradead.org \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=brho@google.com \
--cc=bristot@redhat.com \
--cc=bsegall@google.com \
--cc=daniel@iogearbox.net \
--cc=derkling@google.com \
--cc=dietmar.eggemann@arm.com \
--cc=dschatzberg@meta.com \
--cc=dskarlat@cs.cmu.edu \
--cc=dvernet@meta.com \
--cc=haoluo@google.com \
--cc=joshdon@google.com \
--cc=juri.lelli@redhat.com \
--cc=kernel-team@meta.com \
--cc=linux-kernel@vger.kernel.org \
--cc=martin.lau@kernel.org \
--cc=mgorman@suse.de \
--cc=mingo@redhat.com \
--cc=pjt@google.com \
--cc=riel@surriel.com \
--cc=rostedt@goodmis.org \
--cc=tj@kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=vincent.guittot@linaro.org \
--cc=vschneid@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox