bpf.vger.kernel.org archive mirror
From: Thomas Gleixner <tglx@linutronix.de>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tejun Heo <tj@kernel.org>,
	mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
	bristot@redhat.com, vschneid@redhat.com, ast@kernel.org,
	daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org,
	joshdon@google.com, brho@google.com, pjt@google.com,
	derkling@google.com, haoluo@google.com, dvernet@meta.com,
	dschatzberg@meta.com, dskarlat@cs.cmu.edu, riel@surriel.com,
	changwoo@igalia.com, himadrics@inria.fr, memxor@gmail.com,
	andrea.righi@canonical.com, joel@joelfernandes.org,
	linux-kernel@vger.kernel.org, bpf@vger.kernel.org,
	kernel-team@meta.com
Subject: Re: [PATCHSET v6] sched: Implement BPF extensible scheduler class
Date: Thu, 20 Jun 2024 04:35:08 +0200
Message-ID: <878qz0pcir.ffs@tglx>
In-Reply-To: <CAHk-=wiKgKpNA6Dv7zoLHATweM-nEYWeXeFdS03wUQ8-V4wFxg@mail.gmail.com>

Linus!

On Wed, Jun 19 2024 at 15:55, Linus Torvalds wrote:
> On Wed, 19 Jun 2024 at 15:27, Thomas Gleixner <tglx@linutronix.de> wrote:
>> Right, but that applies to both sides, no?
>
> But Thomas, this isn't a "both sides" issue.

It is, whether you like it or not.

I clearly gave the sched_ext people deep technical feedback on the way
they integrated this seven months ago in Richmond.

I sat down with them because _you_ asked me explicitly for it as _you_
did not want to deal with it.

That feedback got ignored completely. Thank you very much for wasting
_my_ time.

> This is a "people want to do new code and features, and the scheduler
> people ARE ACTIVELY HOLDING IT UP" issue.
>
> Yes, part of that "actively holding it up" is trying to make rules for
> "you need to do this other XYZ thing to make us happy".
>
> But no, then "not doing XYZ" does *NOT* make it some "but but other side" issue.
>
> This, btw, is not some new thing. It's something that has been
> discussed multiple times over the years at the maintainer summit for
> different maintainers. When people come in and propose feature X, it's
> not kosher to then say "you have to do Y first".

Seriously? You yourself requested this from me more than once in the
past 20 years. But let's not go there.

> And yes, maybe everybody even agrees that Y would be a good thing, and
> yes, wouldn't it be lovely if somebody did it. But the people who
> wanted X didn't care about Y, and trying to get Y done by then gating
> X is simply not ok.

I very well remember the discussions about fix X first before Y.

But you are completely ignoring the fact that in this case

    - problem X got introduced by the very same people who are now
      pushing for Y

    - the resolution to problem X was not rejected by the scheduler
      people at all

      It stayed unresolved because the very same parties which are now
      so cosily working together on Y (aka sched_ext) could not agree on
      anything

      So the mess they created in the first place stays as is and both
      parties "solved" their problem with Y (aka sched_ext) and have
      their own implementations to work around it how they see fit.

IOW, what you are saying is:

    #1 Push for your solution X and get it merged

    #2 Ignore the resulting problems related to X for at least a decade

    #3 Come up with a half-baked workaround Y to be able to deal with #2
       and leave everyone else behind to deal with the results of #1 and
       #2

    #4 Make Linus decide that Y is the right thing to do because other
       people suck

The past discussions about fix "X" before "Y" were distinctly different.

> Now, if there was some technical argument against X itself, that would
> be one thing. But the arguments I've heard have basically fallen into
> two camps: the political one ("We don't want to do X because we simply
> don't want an extensible scheduler, because we want people to work on
> _our_ scheduler") and the tying one ("X is ok but we want Y solved
> first").

A lot of technical arguments have been voiced, which never got
addressed and at some point people gave up due to being ignored.

When I sat there in Richmond with the sched_ext people I gave them very
deep technical feedback, especially on the way they integrate it:

  Sprinkle hooks and callbacks all over the place until it works by some
  definition of works.

That's perfectly fine for a PoC, but not for something which gets merged
into the core of an OS. I clearly asked them to refactor the existing
code so that these warts go away and everything is contained in the
scheduler classes and at the very end sched_ext falls into place. That's
a basic engineering principle as far as I know.

They nodded, ignored my feedback and just continued to pursue their way.

Are you still claiming with a straight face that this is a problem
rooted in one party?

It's a problem of one interest group pushing to get this into the tree
no matter what.

I sat in the room at OSPM 2023 when the introduction of the sched ext
advertisement started with "This saves us (FB) millions and Google is
collaborating with us [unspoken - to save millions too by eliminating
their unmaintainable scheduler hacks]", which is clearly a technical
argument, right?

> I was hoping the tying argument would get solved. I saw a couple of
> half-hearted emails to that effect, and Rik at some point saying
> "maybe the problems are solvable", referring to his work from a couple
> of years ago, but again, nothing actually happened.

See above. It's not a problem of the scheduler people.

It's a problem created by the very same people who refuse to solve it
and at the same time push for their new magic cure.

> And I don't see the argument that the way to make something happen is
> to continue to do nothing.

I clearly offered to try to resolve this amicably within a reasonable
time frame.

How exactly is that equivalent to "continue to do nothing"?

> Because if you are serious about making forward progress *with* the
> BPF extensions, why not merge them and actually work with that as the
> base?

There is a very simple argument to this:

   - The way it is integrated sucks in a big way on purely technical
     grounds. If you want details, I'm happy to reiterate them in a
     separate mail just in case you can't find them in your inbox or
     can't remember what I told you seven months ago.

   - This got pointed out by myself seven months ago to the sched_ext
     people

   - They sat with me at the table and nodded

   - Then they went off and ignored it for seven months and just added
     more warts.

Now you are asking me seriously why I don't want to see this merged in
the technical state which it is in right now?

Are you seriously expecting that this is resolved amicably just by
forcing this into the tree at the obvious expense of the people who were
working on the existing code for decades?

That aside, your whole argument about the so-called "very high"
participation barrier on the scheduler is just based on hearsay and not
on facts, which you could easily have gathered yourself. Let me do your
homework.

In the past five years there have been:

   ~4000 mail (patch) threads related to scheduler issues
   ~1600 commits (i.e. 1.6 commits per work day on average)
    ~300 different authors

300 different authors is clearly not a sign of a problematic community,
and neither is the 1:2.5 ratio of commits to mail threads.

The facts tell clearly that this is a healthy community and show no sign
of a participation barrier problem.

The fact that the contributions and contribution attempts from the
proponents of sched_ext are close to zero cannot be abused to claim that
there is a high bar to get patches into the scheduler subsystem.

If you don't try in the first place then you can't complain about it,
no?

Just for the record, the scheduler people and I spent a lot of time
helping to get intrusive features like UMCG into mainline, but the
efforts were dropped by the submitters for no reason. A short time after
that, sched_ext came around.

Can we please agree that the root of this has left the technical grounds
long ago?

If you think that the correct non-technical solution is to resolve this
by brute force without giving those who are willing to work this out in
the proper way a completely irrelevant delay of three months, then I
really have to ask you what you are trying to achieve.

If you pull that stuff as is then you create a patently bad precedent
and on top of that you slap in the face everyone who worked and works
collaboratively and constructively with maintainers and the wider
community to get their features merged.

You are obviously free to do so, but then please clearly state that this
is the new world order by merging an unreviewed patch against
Documentation/process/* which makes this a general rule applicable to
everyone.

We've been there and done that during the 2.5 period and it took us
years to recover from it, but maybe you have forgotten about that
because you merely had to merge the fixes which were created by people
who cared and lost their sleep over the mess.

Whatever the outcome is, we definitely have to have a major discussion
about the underlying problem at the maintainer summit whether you like
it or not.

That said, my offer stands to work on an amicable and collaborative
solution for this nuisance, but that's obviously all I can do.

Up to you.

Thanks,

        Thomas
