From: Andrea Righi <arighi@nvidia.com>
To: Tejun Heo <tj@kernel.org>
Cc: David Vernet <void@manifault.com>,
	Changwoo Min <changwoo@igalia.com>,
	sched-ext@lists.linux.dev, Emil Tsalapatis <emil@etsalapatis.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCHSET v2 sched_ext/for-7.1] sched_ext: Implement SCX_ENQ_IMMED
Date: Fri, 13 Mar 2026 20:21:53 +0100
Message-ID: <abRj0b4CKMnuwjos@gpd4>
In-Reply-To: <20260313113114.1591010-1-tj@kernel.org>

On Fri, Mar 13, 2026 at 01:31:08AM -1000, Tejun Heo wrote:
> Hello,
> 
> Currently, BPF schedulers that want to ensure tasks don't linger on local
> DSQs behind other tasks or on CPUs taken by higher-priority scheduling
> classes must resort to hooking the sched_switch tracepoint or implementing
> the now-deprecated ops.cpu_acquire/release(). Both approaches are cumbersome
> and partial - sched_switch doesn't handle cases where a local DSQ ends up
> with multiple tasks queued, which can be difficult to control perfectly.
> cpu_release() is even more limited, missing cases like a higher-priority
> task waking up while an idle CPU is waking up to an SCX task. Neither can
> atomically determine whether a CPU is truly available at the moment of
> dispatch.
> 
> SCX_ENQ_IMMED replaces these with a single dispatch flag that provides a
> kernel-enforced guarantee: a task dispatched with IMMED either gets on the
> CPU immediately, or gets reenqueued back to the BPF scheduler. It will never
> linger on a local DSQ behind other tasks or be silently put back after
> preemption. This gives BPF schedulers comprehensive latency control directly
> in the dispatch path.
> 
> The protection is persistent - it survives SAVE/RESTORE cycles, slice
> extensions and higher-priority class preemptions. If an IMMED task is
> preempted while running, it gets reenqueued through ops.enqueue() with
> SCX_TASK_REENQ_PREEMPTED instead of being silently placed back on the local DSQ.
> 
> This also enables opportunistic CPU sharing across sub-schedulers. Without
> IMMED, a sub-scheduler can stuff the local DSQ of a shared CPU, making it
> difficult for others to use. With IMMED, tasks only stay on a CPU when they
> can actually run, keeping CPUs available for other schedulers.
> 
> Patches 1-2 are prep refactoring. Patch 3 implements SCX_ENQ_IMMED. Patches
> 4-5 plumb enq_flags through the consume and move_to_local paths so IMMED
> works on those paths too. Patch 6 adds SCX_OPS_ALWAYS_ENQ_IMMED.
> 
> v2: - Split prep patches out of main IMMED patch (#1, #2).
>     - Rewrite is_curr_done() as rq_is_open() using rq->next_class and
>       implement wakeup_preempt_scx() for complete higher-class preemption
>       coverage (#3).
>     - Track IMMED persistently in p->scx.flags and reenqueue
>       preempted-while-running tasks through ops.enqueue() (#3).
>     - Drop "disallow setting slice to zero" patch - no longer needed with
>       rq_is_open() approach.
>     - Plumb enq_flags through consume and move_to_local paths (#4, #5).
>     - Cover scx_bpf_dsq_move_to_local() in OPS_ALWAYS_IMMED (#6).
>     - Remove obsolete sched_switch tracepoint and cpu_release handlers
>       from scx_qmap, add IMMED stress test (#6) (Andrea Righi).
> 
> v1: https://lore.kernel.org/r/20260307002817.1298341-1-tj@kernel.org
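
For readers skimming the thread, the guarantee described in the cover letter
can be sketched as a small user-space toy model. This is not kernel or BPF
code and it is not taken from the patches: every name below (toy_cpu,
toy_cpu_is_open(), toy_dispatch(), toy_preempt()) is hypothetical, and the
"open CPU" test is only a rough stand-in for what the cover letter calls
rq_is_open(). It only illustrates the rule that an IMMED task either runs
right away or goes back to the scheduler, and never waits on the local DSQ.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model of the IMMED rule. All names are hypothetical. */

enum toy_result {
	TOY_RUNNING,     /* task got the CPU immediately */
	TOY_QUEUED,      /* task is waiting on the CPU's local queue */
	TOY_REENQUEUED,  /* task was handed back to the (BPF) scheduler */
};

struct toy_cpu {
	bool busy;          /* a task of this class is currently running */
	bool higher_class;  /* a higher-priority class has taken the CPU */
	int nr_queued;      /* tasks waiting on the local queue */
};

/* Assumption: a CPU is "open" only if nothing is running or queued on it. */
static bool toy_cpu_is_open(const struct toy_cpu *cpu)
{
	return !cpu->busy && !cpu->higher_class && cpu->nr_queued == 0;
}

/* Dispatch one task to @cpu, with or without the IMMED-like behaviour. */
static enum toy_result toy_dispatch(struct toy_cpu *cpu, bool immed)
{
	assert(cpu);

	if (toy_cpu_is_open(cpu)) {
		cpu->busy = true;
		return TOY_RUNNING;
	}
	if (immed)
		return TOY_REENQUEUED;  /* never linger behind other tasks */

	cpu->nr_queued++;
	return TOY_QUEUED;
}

/* A higher-priority class preempts the task running on @cpu. */
static enum toy_result toy_preempt(struct toy_cpu *cpu, bool immed)
{
	assert(cpu);

	cpu->busy = false;
	cpu->higher_class = true;
	if (immed)
		return TOY_REENQUEUED;  /* not silently put back */

	cpu->nr_queued++;
	return TOY_QUEUED;
}
```

In this model, dispatching twice with immed set to a zero-initialised toy_cpu
gives TOY_RUNNING and then TOY_REENQUEUED, with nr_queued left at zero. The
same second dispatch without immed would instead bump nr_queued, which is the
lingering behaviour the patchset is meant to avoid.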

I only found a small typo in patch 3; everything else looks good to me.

Reviewed-by: Andrea Righi <arighi@nvidia.com>

Thanks,
-Andrea

Thread overview: 11+ messages
2026-03-13 11:31 [PATCHSET v2 sched_ext/for-7.1] sched_ext: Implement SCX_ENQ_IMMED Tejun Heo
2026-03-13 11:31 ` [PATCH 1/6] sched_ext: Split task_should_reenq() into local and user variants Tejun Heo
2026-03-13 11:31 ` [PATCH 2/6] sched_ext: Add scx_vet_enq_flags() and plumb dsq_id into preamble Tejun Heo
2026-03-13 11:31 ` [PATCH 3/6] sched_ext: Implement SCX_ENQ_IMMED Tejun Heo
2026-03-13 19:15   ` Andrea Righi
2026-03-13 11:31 ` [PATCH 4/6] sched_ext: Plumb enq_flags through the consume path Tejun Heo
2026-03-13 11:31 ` [PATCH 5/6] sched_ext: Add enq_flags to scx_bpf_dsq_move_to_local() Tejun Heo
2026-03-13 11:31 ` [PATCH 6/6] sched_ext: Add SCX_OPS_ALWAYS_ENQ_IMMED ops flag Tejun Heo
2026-03-13 18:37 ` [PATCH 7/6 sched_ext/for-7.1] sched_ext: Use schedule_deferred_locked() in schedule_dsq_reenq() Tejun Heo
2026-03-13 19:21 ` Andrea Righi [this message]
2026-03-13 19:45 ` [PATCHSET v2 sched_ext/for-7.1] sched_ext: Implement SCX_ENQ_IMMED Tejun Heo
