From: Steven Rostedt <rostedt@goodmis.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>,
torvalds@linux-foundation.org, mingo@redhat.com,
juri.lelli@redhat.com, vincent.guittot@linaro.org,
dietmar.eggemann@arm.com, bsegall@google.com, mgorman@suse.de,
bristot@redhat.com, vschneid@redhat.com, ast@kernel.org,
daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org,
joshdon@google.com, brho@google.com, pjt@google.com,
derkling@google.com, haoluo@google.com, dvernet@meta.com,
dschatzberg@meta.com, dskarlat@cs.cmu.edu, riel@surriel.com,
changwoo@igalia.com, himadrics@inria.fr, memxor@gmail.com,
andrea.righi@canonical.com, joel@joelfernandes.org,
linux-kernel@vger.kernel.org, bpf@vger.kernel.org,
kernel-team@meta.com
Subject: Re: [PATCHSET v6] sched: Implement BPF extensible scheduler class
Date: Mon, 13 May 2024 14:26:46 -0400 [thread overview]
Message-ID: <20240513142646.4dc5484d@rorschach.local.home> (raw)
In-Reply-To: <20240513080359.GI30852@noisy.programming.kicks-ass.net>
On Mon, 13 May 2024 10:03:59 +0200
Peter Zijlstra <peterz@infradead.org> wrote:
> > I believe we agree that we want more people contributing to the scheduling
> > area.
>
> I think therein lies the rub -- contribution. If we were to do this
> thing, random loadable BPF schedulers, then how do we ensure people will
> contribute back?
Hi Peter,
I'm somewhat agnostic to sched_ext itself, but I have been an advocate
for a plugable scheduler infrastructure. And we are seriously looking
at adding it to ChromeOS.
>
> That is, from where I am sitting I see $vendor mandate their $enterprise
> product needs their $BPF scheduler. At which point $vendor will have no
> incentive to ever contribute back.
Believe me they already have their own scheduler, and because its so
different, it's very hard to contribute back.
>
> And customers of $vendor that want to run additional workloads on
> their machine are then stuck with that scheduler, irrespective of it
> being suitable for them or not. This is not a good experience.
And $vendor usually has a unique workload that their changes will
likely cause regressions in other workloads, making it even harder to
contribute back.
>
> So I don't at all mind people playing around with schedulers -- they can
> do so today, there are a ton of out of tree patches to start or learn
> from, or like I said, it really isn't all that hard to just rip out fair
> and write something new.
For cloud servers, I bet a lot of schedulers are not public. Although,
my company tries to publish the schedulers they use.
>
> Open source, you get to do your own thing. Have at.
>
> But part of what made Linux work so well, is in my opinion the GPL. GPL
> forces people to contribute back -- to work on the shared project. And I
> see the whole BPF thing as a run-around on that.
>
> Even the large cloud vendors and service providers (Amazon, Google,
> Facebook etc.) contribute back because of rebase pain -- as you well
> know. The rebase pain offsets the 'TIVO hole'.
From what I understand (I don't work on production, but Chromebooks), a
lot of changes cannot be contributed back because their updates are far
from what is upstream. Having a plugable scheduler would actually allow
them to contribute *more*.
>
> But with the BPF muck; where is the motivation to help improve things?
For the same reasons you mention about GPL and why it works.
Collaboration. Sharing ideas helps everyone. If there's some secret
sauce scheduler then they would likely just replace the scheduler, as
its more performant. I don't believe it would be worth while to use BPF
for that purpose.
>
> Keeping a rando github repo with BPF schedulers is not contributing.
Agreed, and I would guess having them in the Linux kernel tree would be
more beneficial.
> That's just a repo with multiple out of tree schedulers to be ignored.
> Who will put in the effort of upsteaming things if they can hack up a
> BPF and throw it over the wall?
If there's a place in the Linux kernel tree, I'm sure there would be
motivation to place it there. Having it in the kernel proper does give
more visibility of code, and therefore enhancements to that code. This
was the same rationale for putting perf into the kernel proper.
>
> So yeah, I'm very much NOT supportive of this effort. From where I'm
> sitting there is simply not a single benefit. You're not making my life
> better, so why would I care?
>
> How does this BPF muck translate into better quality patches for me?
Here's how we will be using it (we will likely be porting sched_ext to
ChromeOS regardless of its acceptance).
Doing testing of scheduler changes in the field is extremely time
consuming and complex. We tested EEVDF vs CFS by backporting EEVDF to
5.15 (as that is the kernel version we are using on the chromebooks we
were testing on), and then we need to add a user space "switch" to
change the scheduler. Note, this also risks causing a bug in adding
these changes. Then we push the kernel out, and then start our
experiment that enables our feature to a small percentage, and slowly
increases the number of users until we have a enough for a statistical
result.
What sched_ext would give us is a easy way to try different scheduling
algorithms and get feedback much quicker. Once we determine a solution
that improves things, we would then spend the time to implement it in
the scheduler, and yes, send it upstream.
To me, sched_ext should never be the final solution, but it can be
extremely useful in testing various changes quickly in the field. Which
to me would encourage more contributions.
-- Steve
next prev parent reply other threads:[~2024-05-13 18:26 UTC|newest]
Thread overview: 142+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-01 15:09 [PATCHSET v6] sched: Implement BPF extensible scheduler class Tejun Heo
2024-05-01 15:09 ` [PATCH 01/39] cgroup: Implement cgroup_show_cftypes() Tejun Heo
2024-05-01 15:09 ` [PATCH 02/39] sched: Restructure sched_class order sanity checks in sched_init() Tejun Heo
2024-05-01 15:09 ` [PATCH 03/39] sched: Allow sched_cgroup_fork() to fail and introduce sched_cancel_fork() Tejun Heo
2024-05-01 15:09 ` [PATCH 04/39] sched: Add sched_class->reweight_task() Tejun Heo
2024-06-24 10:23 ` Peter Zijlstra
2024-06-24 10:31 ` Peter Zijlstra
2024-06-24 23:59 ` Tejun Heo
2024-06-25 7:29 ` Peter Zijlstra
2024-06-25 23:57 ` Tejun Heo
2024-06-26 1:29 ` [PATCH sched/urgent] sched/fair: set_load_weight() must also call reweight_task() for SCHED_IDLE tasks Tejun Heo
2024-06-26 2:19 ` [PATCH sched_ext/for-6.11] sched_ext: Account for idle policy when setting p->scx.weight in scx_ops_enable_task() Tejun Heo
2024-07-08 19:29 ` [PATCH v2 " Tejun Heo
2024-07-08 15:07 ` [tip: sched/core] sched/fair: set_load_weight() must also call reweight_task() for SCHED_IDLE tasks tip-bot2 for Tejun Heo
2024-05-01 15:09 ` [PATCH 05/39] sched: Add sched_class->switching_to() and expose check_class_changing/changed() Tejun Heo
2024-06-24 11:06 ` Peter Zijlstra
2024-06-24 22:18 ` Tejun Heo
2024-06-25 8:16 ` Peter Zijlstra
2024-05-01 15:09 ` [PATCH 06/39] sched: Factor out cgroup weight conversion functions Tejun Heo
2024-05-01 15:09 ` [PATCH 07/39] sched: Expose css_tg() and __setscheduler_prio() Tejun Heo
2024-06-24 11:19 ` Peter Zijlstra
2024-06-24 18:56 ` Tejun Heo
2024-05-01 15:09 ` [PATCH 08/39] sched: Enumerate CPU cgroup file types Tejun Heo
2024-05-01 15:09 ` [PATCH 09/39] sched: Add @reason to sched_class->rq_{on|off}line() Tejun Heo
2024-06-24 11:32 ` Peter Zijlstra
2024-06-24 21:18 ` Tejun Heo
2024-06-25 8:29 ` Peter Zijlstra
2024-06-25 23:41 ` Tejun Heo
2024-06-26 8:23 ` Peter Zijlstra
2024-06-26 18:01 ` Tejun Heo
2024-06-27 1:27 ` [PATCH sched_ext/for-6.11] sched_ext: Disallow loading BPF scheduler if isolcpus= domain isolation is in effect Tejun Heo
2024-07-08 19:30 ` Tejun Heo
2024-05-01 15:09 ` [PATCH 10/39] sched: Factor out update_other_load_avgs() from __update_blocked_others() Tejun Heo
2024-06-24 12:35 ` Peter Zijlstra
2024-06-24 16:15 ` Vincent Guittot
2024-06-24 19:24 ` Tejun Heo
2024-06-25 9:13 ` Vincent Guittot
2024-06-26 20:49 ` Tejun Heo
2024-05-01 15:09 ` [PATCH 11/39] cpufreq_schedutil: Refactor sugov_cpu_is_busy() Tejun Heo
2024-05-01 15:09 ` [PATCH 12/39] sched: Add normal_policy() Tejun Heo
2024-05-01 15:09 ` [PATCH 13/39] sched_ext: Add boilerplate for extensible scheduler class Tejun Heo
2024-05-01 15:09 ` [PATCH 14/39] sched_ext: Implement BPF " Tejun Heo
2024-05-01 15:09 ` [PATCH 15/39] sched_ext: Add scx_simple and scx_example_qmap example schedulers Tejun Heo
2024-05-01 15:09 ` [PATCH 16/39] sched_ext: Add sysrq-S which disables the BPF scheduler Tejun Heo
2024-05-01 15:09 ` [PATCH 17/39] sched_ext: Implement runnable task stall watchdog Tejun Heo
2024-05-01 15:09 ` [PATCH 18/39] sched_ext: Allow BPF schedulers to disallow specific tasks from joining SCHED_EXT Tejun Heo
2024-06-24 12:40 ` Peter Zijlstra
2024-06-24 19:06 ` Tejun Heo
2024-05-01 15:09 ` [PATCH 19/39] sched_ext: Print sched_ext info when dumping stack Tejun Heo
2024-06-24 12:46 ` Peter Zijlstra
2024-06-24 14:25 ` Linus Torvalds
2024-06-24 18:34 ` Tejun Heo
2024-05-01 15:09 ` [PATCH 20/39] sched_ext: Print debug dump after an error exit Tejun Heo
2024-05-01 15:09 ` [PATCH 21/39] tools/sched_ext: Add scx_show_state.py Tejun Heo
2024-05-01 15:09 ` [PATCH 22/39] sched_ext: Implement scx_bpf_kick_cpu() and task preemption support Tejun Heo
2024-05-01 15:09 ` [PATCH 23/39] sched_ext: Add a central scheduler which makes all scheduling decisions on one CPU Tejun Heo
2024-05-01 15:09 ` [PATCH 24/39] sched_ext: Make watchdog handle ops.dispatch() looping stall Tejun Heo
2024-05-01 15:10 ` [PATCH 25/39] sched_ext: Add task state tracking operations Tejun Heo
2024-05-01 15:10 ` [PATCH 26/39] sched_ext: Implement tickless support Tejun Heo
2024-05-01 15:10 ` [PATCH 27/39] sched_ext: Track tasks that are subjects of the in-flight SCX operation Tejun Heo
2024-05-01 15:10 ` [PATCH 28/39] sched_ext: Add cgroup support Tejun Heo
2024-05-01 15:10 ` [PATCH 29/39] sched_ext: Add a cgroup scheduler which uses flattened hierarchy Tejun Heo
2024-05-01 15:10 ` [PATCH 30/39] sched_ext: Implement SCX_KICK_WAIT Tejun Heo
2024-05-01 15:10 ` [PATCH 31/39] sched_ext: Implement sched_ext_ops.cpu_acquire/release() Tejun Heo
2024-05-01 15:10 ` [PATCH 32/39] sched_ext: Implement sched_ext_ops.cpu_online/offline() Tejun Heo
2024-05-01 15:10 ` [PATCH 33/39] sched_ext: Bypass BPF scheduler while PM events are in progress Tejun Heo
2024-05-01 15:10 ` [PATCH 34/39] sched_ext: Implement core-sched support Tejun Heo
2024-05-01 15:10 ` [PATCH 35/39] sched_ext: Add vtime-ordered priority queue to dispatch_q's Tejun Heo
2024-05-01 15:10 ` [PATCH 36/39] sched_ext: Implement DSQ iterator Tejun Heo
2024-05-01 15:10 ` [PATCH 37/39] sched_ext: Add cpuperf support Tejun Heo
2024-05-01 15:10 ` [PATCH 38/39] sched_ext: Documentation: scheduler: Document extensible scheduler class Tejun Heo
2024-05-02 2:24 ` Bagas Sanjaya
2024-05-01 15:10 ` [PATCH 39/39] sched_ext: Add selftests Tejun Heo
2024-05-02 8:48 ` [PATCHSET v6] sched: Implement BPF extensible scheduler class Peter Zijlstra
2024-05-02 19:20 ` Tejun Heo
2024-05-03 8:52 ` Peter Zijlstra
2024-05-05 23:31 ` Tejun Heo
2024-05-13 8:03 ` Peter Zijlstra
2024-05-13 18:26 ` Steven Rostedt [this message]
2024-05-14 0:07 ` Qais Yousef
2024-05-14 21:34 ` David Vernet
2024-05-27 21:25 ` Qais Yousef
2024-05-28 23:46 ` Tejun Heo
2024-05-29 22:09 ` Qais Yousef
2024-05-17 9:58 ` Peter Zijlstra
2024-05-27 20:29 ` Qais Yousef
2024-05-14 20:22 ` Chris Mason
2024-05-14 22:06 ` Josh Don
2024-05-15 20:41 ` Tejun Heo
2024-05-21 0:19 ` Tejun Heo
2024-05-30 16:49 ` Tejun Heo
2024-05-06 18:47 ` Rik van Riel
2024-05-07 19:33 ` Tejun Heo
2024-05-07 19:47 ` Rik van Riel
2024-05-09 7:38 ` Changwoo Min
2024-05-10 18:24 ` Peter Jung
2024-05-13 20:36 ` Andrea Righi
2024-06-11 21:34 ` Linus Torvalds
2024-06-13 23:38 ` Tejun Heo
2024-06-19 20:56 ` Thomas Gleixner
2024-06-19 22:10 ` Linus Torvalds
2024-06-19 22:27 ` Thomas Gleixner
2024-06-19 22:55 ` Linus Torvalds
2024-06-20 2:35 ` Thomas Gleixner
2024-06-20 5:07 ` Linus Torvalds
2024-06-20 17:11 ` Linus Torvalds
2024-06-20 17:41 ` Tejun Heo
2024-06-20 22:15 ` [PATCH sched_ext/for-6.11] sched, sched_ext: Replace scx_next_task_picked() with sched_class->switch_class() Tejun Heo
2024-06-20 22:42 ` Linus Torvalds
2024-06-21 19:46 ` Tejun Heo
2024-06-24 9:04 ` Peter Zijlstra
2024-06-24 18:41 ` Tejun Heo
2024-06-24 9:02 ` Peter Zijlstra
2024-06-21 19:52 ` Tejun Heo
2024-06-24 8:59 ` Peter Zijlstra
2024-06-24 21:01 ` Tejun Heo
2024-06-25 7:49 ` Peter Zijlstra
2024-06-25 23:30 ` Tejun Heo
2024-06-26 8:28 ` Peter Zijlstra
2024-06-26 17:56 ` Tejun Heo
2024-06-20 18:47 ` [PATCHSET v6] sched: Implement BPF extensible scheduler class Thomas Gleixner
2024-06-20 19:20 ` Linus Torvalds
2024-06-21 9:35 ` Thomas Gleixner
2024-06-21 16:34 ` Linus Torvalds
2024-06-23 2:00 ` Tejun Heo
2024-06-23 10:31 ` Thomas Gleixner
2024-06-23 10:33 ` Thomas Gleixner
2024-06-24 14:23 ` Jason Gunthorpe
2024-06-20 19:58 ` Tejun Heo
2024-06-24 9:34 ` Peter Zijlstra
2024-06-24 20:17 ` Tejun Heo
2024-06-24 20:51 ` [PATCH sched_ext/for-6.11] sched, sched_ext: Simplify dl_prio() case handling in sched_fork() Tejun Heo
2024-07-08 18:56 ` Tejun Heo
2024-06-20 19:35 ` [PATCHSET v6] sched: Implement BPF extensible scheduler class Tejun Heo
2024-06-21 10:46 ` Thomas Gleixner
2024-06-21 21:14 ` Chris Mason
2024-06-23 8:14 ` Thomas Gleixner
2024-06-24 16:42 ` Chris Mason
2024-06-24 18:11 ` Tejun Heo
2024-06-24 22:01 ` Peter Oskolkov
2024-06-24 22:17 ` David Vernet
2024-06-24 21:54 ` Peter Oskolkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240513142646.4dc5484d@rorschach.local.home \
--to=rostedt@goodmis.org \
--cc=andrea.righi@canonical.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=brho@google.com \
--cc=bristot@redhat.com \
--cc=bsegall@google.com \
--cc=changwoo@igalia.com \
--cc=daniel@iogearbox.net \
--cc=derkling@google.com \
--cc=dietmar.eggemann@arm.com \
--cc=dschatzberg@meta.com \
--cc=dskarlat@cs.cmu.edu \
--cc=dvernet@meta.com \
--cc=haoluo@google.com \
--cc=himadrics@inria.fr \
--cc=joel@joelfernandes.org \
--cc=joshdon@google.com \
--cc=juri.lelli@redhat.com \
--cc=kernel-team@meta.com \
--cc=linux-kernel@vger.kernel.org \
--cc=martin.lau@kernel.org \
--cc=memxor@gmail.com \
--cc=mgorman@suse.de \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=pjt@google.com \
--cc=riel@surriel.com \
--cc=tj@kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=vincent.guittot@linaro.org \
--cc=vschneid@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox