public inbox for linux-kernel@vger.kernel.org
From: Andrea Righi <arighi@nvidia.com>
To: Tejun Heo <tj@kernel.org>
Cc: David Vernet <void@manifault.com>,
	Changwoo Min <changwoo@igalia.com>,
	sched-ext@lists.linux.dev, Emil Tsalapatis <emil@etsalapatis.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCHSET sched_ext/for-7.1] sched_ext: Implement SCX_ENQ_IMMED
Date: Sat, 7 Mar 2026 23:36:46 +0100
Message-ID: <aayofhyFFX9jng1h@gpd4>
In-Reply-To: <20260307002817.1298341-1-tj@kernel.org>

Hi Tejun,

On Fri, Mar 06, 2026 at 02:28:14PM -1000, Tejun Heo wrote:
> Hello,
> 
> SCX_ENQ_IMMED makes enqueue to local DSQs succeed only if the task can
> start running immediately - the current task is done and no other tasks are
> waiting. If the condition isn't met, the task is re-enqueued through
> ops.enqueue(). This gives the BPF scheduler tighter control over when tasks
> actually land on a CPU.

This looks interesting, but I'm trying to understand the typical use case
of this feature.

I agree that we need some kernel support to "atomically" determine when a
CPU is available (it can't be done fully in BPF). Initially I thought the
main target for ENQ_IMMED was latency-sensitive workloads, but it actually
hurts latency because of the additional re-enqueue cost. In that case it
might be better to be "less perfect" and not use ENQ_IMMED.

So I'm wondering if this feature is aimed more at the multiple
sub-scheduler scenario, to prevent a single scheduler from filling local
DSQs (effectively monopolizing a CPU while tasks sit in line). With
ENQ_IMMED, instead, we put a task on a CPU only when it can run *right
now*. So the benefit is more in terms of fairness and isolation between
schedulers than raw latency or throughput.
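
To make the condition concrete, here is a toy userspace model of it. This
is only a sketch under my own assumptions: struct toy_cpu,
toy_enqueue_local() and the TOY_* result codes are made-up names, not the
kernel's API, and the real code re-enqueues through ops.enqueue() instead
of returning a code to the caller.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_DSQ_CAP 8

/* Toy stand-in for one CPU and its local DSQ; not the kernel's structures. */
struct toy_cpu {
	bool   curr_done;           /* running task has nothing left to run */
	int    queued[TOY_DSQ_CAP]; /* task ids waiting in the local DSQ */
	size_t nr_queued;
};

enum toy_result {
	TOY_QUEUED, /* task landed in the local DSQ */
	TOY_REENQ,  /* IMMED condition failed, task goes back to the scheduler */
	TOY_FULL,   /* toy-only: the fixed-size queue overflowed */
};

/*
 * Enqueue task @tid on @cpu's local DSQ. With @immed set, succeed only if
 * the task could start right away: current task done and nobody waiting.
 */
static enum toy_result toy_enqueue_local(struct toy_cpu *cpu, int tid,
					 bool immed)
{
	if (immed && (!cpu->curr_done || cpu->nr_queued > 0))
		return TOY_REENQ;
	if (cpu->nr_queued == TOY_DSQ_CAP)
		return TOY_FULL;
	cpu->queued[cpu->nr_queued++] = tid;
	return TOY_QUEUED;
}
```

In this model a plain enqueue on a busy CPU always queues, which is how one
scheduler could pile tasks up behind the running one, while an immed
enqueue on the same CPU is bounced back and leaves the queue untouched.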

Am I understanding this correctly? If that's the case, it might be useful
to clarify this or describe some use cases that you have in mind.

Thanks,
-Andrea

> 
> - Patch 1 disallows setting slice to zero via scx_bpf_task_set_slice() as
>   zero slice is used by ENQ_IMMED to detect whether the current task is
>   done.
> 
> - Patch 2 implements SCX_ENQ_IMMED with reenqueue support and loop
>   detection.
> 
> - Patch 3 adds SCX_OPS_ALWAYS_ENQ_IMMED ops flag to automatically apply
>   IMMED to all local DSQ enqueues.
> 
> This patchset depends on:
> 
> - "sched_ext: Overhaul DSQ reenqueue infrastructure"
>   http://lkml.kernel.org/r/20260306190623.1076074-1-tj@kernel.org
> 
> Based on sched_ext/for-7.1 (4f8b122848db) + scx-reenq (a41719e6ae12).
> 
>  0001-sched_ext-Disallow-setting-slice-to-zero-via-scx_bpf.patch
>  0002-sched_ext-Implement-SCX_ENQ_IMMED.patch
>  0003-sched_ext-Add-SCX_OPS_ALWAYS_ENQ_IMMED-ops-flag.patch
> 
> Git tree:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext.git scx-enq-immed
> 
>  include/linux/sched/ext.h            |   3 +
>  kernel/sched/ext.c                   | 208 +++++++++++++++++++++++++++++++----
>  kernel/sched/ext_internal.h          |  43 ++++++++
>  kernel/sched/sched.h                 |   2 +
>  tools/sched_ext/include/scx/compat.h |   1 +
>  tools/sched_ext/scx_qmap.bpf.c       |   7 +-
>  tools/sched_ext/scx_qmap.c           |   9 +-
>  7 files changed, 250 insertions(+), 23 deletions(-)
> 
> --
> tejun
