All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kuba Piecuch <jpiecuch@google.com>
To: Tejun Heo <tj@kernel.org>, David Vernet <void@manifault.com>,
	 Andrea Righi <andrea.righi@linux.dev>,
	Changwoo Min <changwoo@igalia.com>
Cc: <linux-kernel@vger.kernel.org>, <sched-ext@lists.linux.dev>,
	 Peter Zijlstra <peterz@infradead.org>,
	Wen-Fang Liu <liuwenfang@honor.com>
Subject: Re: [PATCH v3 3/3] sched_ext: Allow scx_bpf_reenqueue_local() to be called from anywhere
Date: Thu, 11 Dec 2025 14:24:04 +0000	[thread overview]
Message-ID: <DEVGF29ONM1A.CCTFEV5OZI9D@google.com> (raw)
In-Reply-To: <7e9b27d70e31c243da3ce77e622b0af5@kernel.org>

Hi Tejun,

I think with the proposed implementation, using scx_bpf_reenqueue_local()
from arbitrary contexts can have highly non-intuitive effects.

For example, consider ops.enqueue() for a hypothetical userspace scheduler:

void BPF_STRUCT_OPS(example_enqueue, struct task_struct *p, u64 enq_flags)
{
	if (p->pid == user_scheduler_pid()) {
		/*
		 * Remove existing tasks from the local DSQ so that
		 * the userspace scheduler can schedule different tasks
		 * before them.
		 */
		scx_bpf_reenqueue_local();
		/*
		 * Dispatch the user scheduler directly to the local DSQ.
		 */
		scx_bpf_dsq_insert(p, SCX_DSQ_LOCAL, SCX_SLICE_DFL, 0);
	}
	...
}

I'm not arguing this is the way it should be written, but AFAIK it's perfectly
legal.

Since we're doing a direct dispatch, the user scheduler task will be
inserted into the dispatch queue in enable_task_scx(), without dropping the rq
lock between example_enqueue() and the insertion, which means reenq_local()
will run afterwards (since it's deferred using irq_work), removing all tasks
from the DSQ, including the userspace scheduler.

A similar problem arises even if we don't do direct dispatch and drop the rq
lock after example_enqueue(): since dispatching and reenq_local() are deferred
using different irq_work entries, and irq_work_run() processes entries from
newest to oldest, dispatching will be handled before reenq_local(), yielding
the same result.

The user may be unaware of this behavior (it's not mentioned anywhere) and
expect the reenqueue to happen before dispatching the new task.

I think at the very least we should make users aware of this in the comment
for scx_bpf_reenqueue_local___v2().

Best,
Kuba

  parent reply	other threads:[~2025-12-11 14:24 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-25  0:18 [PATCHSET sched_ext/for-6.19] sched_ext: Deprecate ops.cpu_acquire/release() Tejun Heo
2025-10-25  0:18 ` [PATCH 1/3] sched_ext: Split schedule_deferred() into locked and unlocked variants Tejun Heo
2025-10-25 23:17   ` Emil Tsalapatis
2025-10-25  0:18 ` [PATCH 2/3] sched_ext: Factor out reenq_local() from scx_bpf_reenqueue_local() Tejun Heo
2025-10-25 23:19   ` Emil Tsalapatis
2025-10-25  0:18 ` [PATCH 3/3] sched_ext: Allow scx_bpf_reenqueue_local() to be called from anywhere Tejun Heo
2025-10-25 23:21   ` Emil Tsalapatis
2025-10-27  9:18   ` Peter Zijlstra
2025-10-27 16:00     ` Tejun Heo
2025-10-27 17:49       ` Peter Zijlstra
2025-10-27 18:05         ` Tejun Heo
2025-10-27 18:07           ` Peter Zijlstra
2025-10-27 18:10       ` Peter Zijlstra
2025-10-27 18:17         ` Tejun Heo
2025-10-28 11:01           ` Peter Zijlstra
2025-10-28 17:07             ` Tejun Heo
2025-10-27 18:19   ` [PATCH v2 " Tejun Heo
2025-10-29 10:45     ` Peter Zijlstra
2025-10-29 15:11       ` Tejun Heo
2025-10-29 15:49     ` [PATCH v3 " Tejun Heo
2025-11-27 10:39       ` Kuba Piecuch
2025-12-02 23:05         ` Tejun Heo
2025-12-11 14:24       ` Kuba Piecuch [this message]
2025-12-11 16:17         ` Tejun Heo
2025-12-11 16:20           ` Tejun Heo
2025-12-13  1:16             ` Andrea Righi
2025-12-13  1:18               ` Tejun Heo
2025-10-29 15:31 ` [PATCHSET sched_ext/for-6.19] sched_ext: Deprecate ops.cpu_acquire/release() Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DEVGF29ONM1A.CCTFEV5OZI9D@google.com \
    --to=jpiecuch@google.com \
    --cc=andrea.righi@linux.dev \
    --cc=changwoo@igalia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=liuwenfang@honor.com \
    --cc=peterz@infradead.org \
    --cc=sched-ext@lists.linux.dev \
    --cc=tj@kernel.org \
    --cc=void@manifault.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.