public inbox for sched-ext@lists.linux.dev
From: Matt Fleming <matt@readmodwrite.com>
To: Tejun Heo <tj@kernel.org>
Cc: sched-ext@lists.linux.dev, kernel-team@cloudflare.com,
	arighi@nvidia.com, void@manifault.com, changwoo@igalia.com,
	peterz@infradead.org, linux-kernel@vger.kernel.org
Subject: Re: sched_ext: Partial mode priority and fallthrough to EEVDF
Date: Wed, 11 Mar 2026 11:10:11 +0000
Message-ID: <abFEFlivu8vk5pex@matt-Precision-5490>
In-Reply-To: <abBidFKC2Hp6Wca-@slm.duckdns.org>

On Tue, Mar 10, 2026 at 08:27:00AM -1000, Tejun Heo wrote:
> 
> Hmm... I have a bit of hard time following how that's different from partial
> mode. If you want the scheduler to decide whether a task should be in SCX or
> fair, you can do so from ops.init_task() by asserting p->scx.disallow. If
> you mean that you want to switch dynamically on each scheduling event, I
> don't think that's a good idea given that each hop would be full sched_class
> switch.
 
Oh no, I don't want to switch dynamically at runtime. Doing the
classification once at BPF program load time is fine, but AFAIU
p->scx.disallow still gives us two scheduling classes (SCHED_EXT and
SCHED_NORMAL) where tasks in the fair class get chosen first.
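
As a toy illustration of that ordering (a userspace model under stated
assumptions, not kernel code): the names toy_class, toy_rq and
toy_pick_class are hypothetical, and the stop, deadline and RT classes
are left out for brevity. The sketch only shows that, on a CPU with
runnable tasks in both classes, a walk from the highest-priority class
downwards reaches the fair class before the ext class.

```c
#include <assert.h>

/* Toy model, NOT kernel code: classes listed highest priority first.
 * The stop, deadline and RT classes are omitted for brevity. */
enum toy_class { TOY_FAIR = 0, TOY_EXT = 1, TOY_IDLE = 2, TOY_NR_CLASSES = 3 };

struct toy_rq {
	int nr_runnable[TOY_NR_CLASSES]; /* runnable tasks per class on one CPU */
};

/* Walk the classes from highest to lowest priority and return the first
 * one that has a runnable task. Falls back to the idle class when neither
 * the fair nor the ext class has anything to run. */
static enum toy_class toy_pick_class(const struct toy_rq *rq)
{
	for (int c = TOY_FAIR; c < TOY_IDLE; c++)
		if (rq->nr_runnable[c] > 0)
			return (enum toy_class)c;
	return TOY_IDLE;
}
```

With one runnable task in each class, toy_pick_class() returns TOY_FAIR;
the ext task is only picked once the fair queue on that CPU is empty.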

> As for the ordering between the two, I don't know. How are you using partial
> mode? No matter how you order them, the behaviors on pathological cases are
> pretty bad and I've been thinking that most would use partial mode to
> partition the system so that some CPUs are managed by SCX and others by fair
> in which case the ordering doesn't matter that much. If you're mixing the
> two classes on the same CPUs, I wonder whether this is something which can
> be better dealt with the deadline servers. Andrea, what do you think?

I want to use SCHED_EXT to schedule the most latency-critical tasks
because a custom BPF scheduler allows me to make better CPU placement
and preemption decisions. Doing it with partial mode allows me to
progressively switch services over to SCHED_EXT without needing to take
on a mass migration for 100+ services in one go (something I'm trying
my hardest to avoid :) ).

To clarify my "fallthrough to EEVDF" comment: if I could run in
full mode, use disallow to keep most tasks on EEVDF, and have SCHED_EXT
tasks scheduled with higher priority than SCHED_NORMAL, then this would
tick all the boxes.

I have experimented with isolating CPUs where all tasks running are
SCHED_EXT while other CPUs run the SCHED_NORMAL workloads, so that's a
possibility. But not all our servers are configured that way and given
that we run heterogeneous workloads on single machines, it's a tall
price to pay capacity-wise if we can't fully utilise those isolated
CPUs at all times.

And to limit the pathological case in my experiments so far, I'm using
cpu.max to cap CPU bandwidth (thanks to scx_lavd's bandwidth support).
All our services are systemd services, so we can set limits to guard
against complete meltdowns.
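
For reference, a cgroup v2 cpu.max value has the form "<quota> <period>"
in microseconds, or "max <period>" for no limit; systemd's CPUQuota=
setting is the usual way to set it for a service. The helper below is a
minimal sketch of that arithmetic only. The function name cpu_max_parse
is hypothetical, and nothing here reflects how scx_lavd itself enforces
the cap.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: interpret a cgroup v2 cpu.max string.
 * Returns 1 and stores the cap as a percentage of one CPU in *percent,
 * 0 if the value is "max" (no limit; *percent is left untouched),
 * or -1 if the string is malformed. */
static int cpu_max_parse(const char *s, long *percent)
{
	long quota, period;

	if (strncmp(s, "max", 3) == 0)
		return 0;
	if (sscanf(s, "%ld %ld", &quota, &period) != 2 ||
	    quota <= 0 || period <= 0)
		return -1;
	*percent = quota * 100 / period; /* integer division, rounds down */
	return 1;
}
```

For example, "50000 100000" works out to 50 (half a CPU) and
"200000 100000" to 200 (two CPUs' worth of bandwidth).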

Thanks for the tip on the DL server. This looks promising and might
solve my problem nicely. I'll reply in more detail to Andrea's post.

