From: Andrea Righi <arighi@nvidia.com>
To: Tejun Heo <tj@kernel.org>
Cc: David Vernet <void@manifault.com>,
	Changwoo Min <changwoo@igalia.com>,
	sched-ext@lists.linux.dev, Emil Tsalapatis <emil@etsalapatis.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCHSET sched_ext/for-7.1] sched_ext: Implement SCX_ENQ_IMMED
Date: Sat, 7 Mar 2026 23:36:46 +0100
Message-ID: <aayofhyFFX9jng1h@gpd4>
In-Reply-To: <20260307002817.1298341-1-tj@kernel.org>

Hi Tejun,

On Fri, Mar 06, 2026 at 02:28:14PM -1000, Tejun Heo wrote:
> Hello,
> 
> SCX_ENQ_IMMED makes enqueue to local DSQs succeed only if the task can
> start running immediately - the current task is done and no other tasks are
> waiting. If the condition isn't met, the task is re-enqueued through
> ops.enqueue(). This gives the BPF scheduler tighter control over when tasks
> actually land on a CPU.

This looks interesting, but I'm trying to understand the typical use case
of this feature.

I agree that we need some kernel support to "atomically" determine when a
CPU is available (it can't be done fully in BPF). Initially I thought the
main target for ENQ_IMMED was to improve latency-sensitive workloads, but
it actually hurts latency because of the additional re-enqueue cost. For
those workloads it might be better to be "less perfect" and not use
ENQ_IMMED.

So I'm wondering if this feature is aimed more at the multiple
sub-scheduler scenario, to prevent a single scheduler from filling the
local DSQs (effectively monopolizing a CPU while tasks sit in line). With
ENQ_IMMED, instead, we can put a task on a CPU only when it can run *right
now*. So the benefit is more in terms of fairness and isolation between
schedulers than raw latency or throughput.
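
The semantics described in the cover letter can be sketched as a small
userspace toy model. This is only an illustration, not the kernel
implementation: toy_cpu, toy_enqueue_local(), TOY_ENQ_IMMED and the
result codes are hypothetical names invented for this sketch. The only
things taken from the cover letter are that a zero slice means the
current task is done and that a failed IMMED enqueue goes back through
ops.enqueue().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_cpu {
	uint64_t curr_slice;	/* remaining slice of the running task, 0 = done */
	unsigned int nr_queued;	/* tasks waiting on this CPU's local DSQ */
};

/* Hypothetical stand-in for SCX_ENQ_IMMED. */
#define TOY_ENQ_IMMED	(1u << 0)

enum toy_result {
	TOY_QUEUED,	/* task landed on the local DSQ */
	TOY_REENQUEUE,	/* task would go back through ops.enqueue() */
};

/*
 * An IMMED enqueue succeeds only if the task could start running
 * immediately: the current task is done and nobody else is waiting.
 * A non-IMMED enqueue is always queued, even behind other tasks.
 */
static enum toy_result toy_enqueue_local(struct toy_cpu *cpu,
					 unsigned int enq_flags)
{
	bool can_run_now = cpu->curr_slice == 0 && cpu->nr_queued == 0;

	if ((enq_flags & TOY_ENQ_IMMED) && !can_run_now)
		return TOY_REENQUEUE;

	cpu->nr_queued++;
	return TOY_QUEUED;
}

/* Usage example: an idle CPU accepts IMMED, a busy one bounces it. */
static void toy_demo(void)
{
	struct toy_cpu idle = { .curr_slice = 0, .nr_queued = 0 };
	struct toy_cpu busy = { .curr_slice = 5000000, .nr_queued = 0 };

	assert(toy_enqueue_local(&idle, TOY_ENQ_IMMED) == TOY_QUEUED);
	assert(toy_enqueue_local(&busy, TOY_ENQ_IMMED) == TOY_REENQUEUE);
	/* Without IMMED the same busy CPU simply queues the task. */
	assert(toy_enqueue_local(&busy, 0) == TOY_QUEUED);
}
```

In this model a scheduler that always passes the IMMED flag can never
build up a backlog on a local DSQ, which is the isolation property
discussed above. The price is the extra trip through the re-enqueue
path whenever the CPU is not immediately available.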

Am I understanding this correctly? If so, it might be useful to clarify
this or describe some of the use cases that you have in mind.

Thanks,
-Andrea

> 
> - Patch 1 disallows setting slice to zero via scx_bpf_task_set_slice() as
>   zero slice is used by ENQ_IMMED to detect whether the current task is
>   done.
> 
> - Patch 2 implements SCX_ENQ_IMMED with reenqueue support and loop
>   detection.
> 
> - Patch 3 adds SCX_OPS_ALWAYS_ENQ_IMMED ops flag to automatically apply
>   IMMED to all local DSQ enqueues.
> 
> This patchset depends on:
> 
> - "sched_ext: Overhaul DSQ reenqueue infrastructure"
>   http://lkml.kernel.org/r/20260306190623.1076074-1-tj@kernel.org
> 
> Based on sched_ext/for-7.1 (4f8b122848db) + scx-reenq (a41719e6ae12).
> 
>  0001-sched_ext-Disallow-setting-slice-to-zero-via-scx_bpf.patch
>  0002-sched_ext-Implement-SCX_ENQ_IMMED.patch
>  0003-sched_ext-Add-SCX_OPS_ALWAYS_ENQ_IMMED-ops-flag.patch
> 
> Git tree:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext.git scx-enq-immed
> 
>  include/linux/sched/ext.h            |   3 +
>  kernel/sched/ext.c                   | 208 +++++++++++++++++++++++++++++++----
>  kernel/sched/ext_internal.h          |  43 ++++++++
>  kernel/sched/sched.h                 |   2 +
>  tools/sched_ext/include/scx/compat.h |   1 +
>  tools/sched_ext/scx_qmap.bpf.c       |   7 +-
>  tools/sched_ext/scx_qmap.c           |   9 +-
>  7 files changed, 250 insertions(+), 23 deletions(-)
> 
> --
> tejun
