From: Cheng-Yang Chou <yphbchou0911@gmail.com>
To: sched-ext@lists.linux.dev, Tejun Heo <tj@kernel.org>,
	David Vernet <void@manifault.com>,
	Andrea Righi <arighi@nvidia.com>,
	Changwoo Min <changwoo@igalia.com>
Cc: Ching-Chun Huang <jserv@ccns.ncku.edu.tw>,
	Chia-Ping Tsai <chia7712@gmail.com>,
	yphbchou0911@gmail.com
Subject: [PATCH] tools/sched_ext: Add scx_lib_init() to probe BPF prolog migrate_disable() behavior
Date: Fri,  3 Apr 2026 12:09:02 +0800
Message-ID: <20260403040913.610756-1-yphbchou0911@gmail.com>

is_migration_disabled() uses LINUX_KERNEL_VERSION >= v6.18 to detect
whether the BPF prolog calls migrate_disable(). This is unreliable for
downstream kernels that cherry-pick 8e4f0b1ebcf2 ("bpf: use
rcu_read_lock_dont_migrate() for trampoline.c") onto a pre-v6.18 base.

Add scx_lib_init(), to be called from ops.init(), to probe the actual
runtime behavior. The result is stored in __scx_prolog_disables_migration
and consulted on the slow path of is_migration_disabled(), which handles
the pre-v6.18 !CONFIG_PREEMPT_RCU case. The two fast paths
(CONFIG_PREEMPT_RCU and v6.18+ without CONFIG_PREEMPT_RCU) are unaffected.

Call scx_lib_init() in dsp_local_on's ops.init() as the first user.

Sync with upstream scx a4863764ff55, a6101f6c277f, and 1488e1aaf659.

Cc: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com>
---
 tools/sched_ext/include/scx/common.bpf.h      | 82 +++++++++++++++----
 .../selftests/sched_ext/dsp_local_on.bpf.c    |  6 ++
 2 files changed, 72 insertions(+), 16 deletions(-)
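
Note (not part of the commit message): the decision made by
is_migration_disabled() after this patch can be summarized with a small
userspace model. This is only a sketch under stated assumptions: struct
model_env and the model_*() helpers are hypothetical names invented for
illustration, and plain booleans stand in for CONFIG_PREEMPT_RCU, the
LINUX_KERNEL_VERSION check, and __scx_prolog_disables_migration. It is not
BPF code and does not touch task_struct.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the inputs the BPF helper reads. */
struct model_env {
	bool preempt_rcu;	/* models CONFIG_PREEMPT_RCU */
	bool v6_18_or_later;	/* models LINUX_KERNEL_VERSION >= 6.18 */
	bool prolog_disables;	/* models __scx_prolog_disables_migration */
};

/*
 * Models the probe in scx_lib_init(): @cur_count is the migration_disabled
 * count observed for the current task while BPF code is already running.
 */
static void model_lib_init(struct model_env *env, unsigned int cur_count)
{
	env->prolog_disables = cur_count > 0;
}

/* Models is_migration_disabled() for a task with the given count. */
static bool model_is_migration_disabled(const struct model_env *env,
					unsigned int count, bool p_is_current)
{
	if (count != 1)
		return count != 0;	/* 0: not disabled, >= 2: disabled */
	if (env->preempt_rcu)
		return !p_is_current;	/* fast path: prolog always disables */
	if (env->v6_18_or_later)
		return true;		/* fast path: prolog never disables */
	/* slow path: pre-v6.18, !PREEMPT_RCU, trust the probed flag */
	return env->prolog_disables ? !p_is_current : true;
}

/* Usage example: a pre-v6.18, !PREEMPT_RCU kernel with the cherry-pick. */
static void model_selfcheck(void)
{
	struct model_env env = { false, false, true };

	/* Default flag: a count of 1 on current is attributed to the prolog. */
	assert(!model_is_migration_disabled(&env, 1, true));

	/* Probe sees count 0, so the prolog did not disable migration. */
	model_lib_init(&env, 0);
	assert(model_is_migration_disabled(&env, 1, true));
}
```

With the default flag value the model reproduces the original p == current
disambiguation; only a probe that observes a zero count switches the slow
path to treating a count of 1 as a real migrate_disable() call.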

diff --git a/tools/sched_ext/include/scx/common.bpf.h b/tools/sched_ext/include/scx/common.bpf.h
index 821d5791bd42..2c143e3d14c4 100644
--- a/tools/sched_ext/include/scx/common.bpf.h
+++ b/tools/sched_ext/include/scx/common.bpf.h
@@ -47,6 +47,7 @@
 extern int LINUX_KERNEL_VERSION __kconfig;
 extern const char CONFIG_CC_VERSION_TEXT[64] __kconfig __weak;
 extern const char CONFIG_LOCALVERSION[64] __kconfig __weak;
+extern bool CONFIG_PREEMPT_RCU __kconfig __weak;
 
 /*
  * Earlier versions of clang/pahole lost upper 32bits in 64bit enums which can
@@ -447,6 +448,38 @@ static __always_inline const struct cpumask *cast_mask(struct bpf_cpumask *mask)
 	return (const struct cpumask *)mask;
 }
 
+/*
+ * True if the BPF prolog (__bpf_prog_enter) calls migrate_disable() for the
+ * current task. Probed at runtime by scx_lib_init(). Defaults to true because
+ * the prolog called migrate_disable() unconditionally on kernels before v6.18,
+ * so schedulers that omit scx_lib_init() safely fall back to the original
+ * p == current disambiguation.
+ */
+static bool __scx_prolog_disables_migration = true;
+
+/*
+ * scx_lib_init - initialize the scx BPF library
+ *
+ * Must be called at the top of ops.init(). Probes runtime behavior needed by
+ * library functions such as is_migration_disabled().
+ *
+ * Returns 0 on success.
+ */
+static inline int scx_lib_init(void)
+{
+	/*
+	 * Probe whether the BPF prolog calls migrate_disable() by checking
+	 * migration_disabled of the current task. Since we are executing BPF
+	 * code right now, the prolog has already run: if it called
+	 * migrate_disable(), migration_disabled is non-zero.
+	 */
+	if (bpf_core_field_exists(((struct task_struct *)0)->migration_disabled)) {
+		const struct task_struct *p = bpf_get_current_task_btf();
+		__scx_prolog_disables_migration = p->migration_disabled > 0;
+	}
+	return 0;
+}
+
 /*
  * Return true if task @p cannot migrate to a different CPU, false
  * otherwise.
@@ -454,25 +487,42 @@ static __always_inline const struct cpumask *cast_mask(struct bpf_cpumask *mask)
 static inline bool is_migration_disabled(const struct task_struct *p)
 {
 	/*
-	 * Testing p->migration_disabled in a BPF code is tricky because the
-	 * migration is _always_ disabled while running the BPF code.
-	 * The prolog (__bpf_prog_enter) and epilog (__bpf_prog_exit) for BPF
-	 * code execution disable and re-enable the migration of the current
-	 * task, respectively. So, the _current_ task of the sched_ext ops is
-	 * always migration-disabled. Moreover, p->migration_disabled could be
-	 * two or greater when a sched_ext ops BPF code (e.g., ops.tick) is
-	 * executed in the middle of the other BPF code execution.
+	 * Testing p->migration_disabled in BPF is tricky because the BPF prolog
+	 * (__bpf_prog_enter) may call migrate_disable() for the current task,
+	 * making migration_disabled == 1 even for tasks that are not truly
+	 * migration-disabled.
+	 *
+	 * Since commit 8e4f0b1ebcf2 ("bpf: use rcu_read_lock_dont_migrate() for
+	 * trampoline.c"), the BPF prolog calls migrate_disable() only when
+	 * CONFIG_PREEMPT_RCU is enabled. Two fast paths cover the common cases:
+	 *
+	 *   1) CONFIG_PREEMPT_RCU: prolog always calls migrate_disable(), so
+	 *      migration_disabled == 1 for the current task is ambiguous.
+	 *      Disambiguate by checking p == current.
+	 *
+	 *   2) v6.18+ without CONFIG_PREEMPT_RCU: prolog never calls
+	 *      migrate_disable(), so migration_disabled == 1 is unambiguously
+	 *      a real migrate_disable() call.
 	 *
-	 * Therefore, we should decide that the _current_ task is
-	 * migration-disabled only when its migration_disabled count is greater
-	 * than one. In other words, when  p->migration_disabled == 1, there is
-	 * an ambiguity, so we should check if @p is the current task or not.
+	 * A slow path handles pre-v6.18 kernels without CONFIG_PREEMPT_RCU,
+	 * where the prolog historically called migrate_disable() unconditionally
+	 * but a cherry-picked downstream kernel may not. The runtime-probed flag
+	 * __scx_prolog_disables_migration (set by scx_lib_init()) distinguishes
+	 * the two cases without relying on the kernel version alone.
 	 */
 	if (bpf_core_field_exists(p->migration_disabled)) {
-		if (p->migration_disabled == 1)
-			return bpf_get_current_task_btf() != p;
-		else
-			return p->migration_disabled;
+		if (p->migration_disabled == 1) {
+			/* Fast path: prolog always disables migration */
+			if (CONFIG_PREEMPT_RCU)
+				return bpf_get_current_task_btf() != p;
+			/* Fast path: prolog never disables migration */
+			if (LINUX_KERNEL_VERSION >= KERNEL_VERSION(6, 18, 0))
+				return true;
+			/* Slow path: pre-v6.18, !PREEMPT_RCU - use runtime flag */
+			return __scx_prolog_disables_migration ?
+			       bpf_get_current_task_btf() != p : true;
+		}
+		return p->migration_disabled;
 	}
 	return false;
 }
diff --git a/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c b/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c
index c02b2aa6fc64..f8c5239a9637 100644
--- a/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c
+++ b/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c
@@ -16,6 +16,11 @@ struct {
 	__type(value, s32);
 } queue SEC(".maps");
 
+s32 BPF_STRUCT_OPS(dsp_local_on_init)
+{
+	return scx_lib_init();
+}
+
 s32 BPF_STRUCT_OPS(dsp_local_on_select_cpu, struct task_struct *p,
 		   s32 prev_cpu, u64 wake_flags)
 {
@@ -59,6 +64,7 @@ void BPF_STRUCT_OPS(dsp_local_on_exit, struct scx_exit_info *ei)
 
 SEC(".struct_ops.link")
 struct sched_ext_ops dsp_local_on_ops = {
+	.init			= (void *) dsp_local_on_init,
 	.select_cpu		= (void *) dsp_local_on_select_cpu,
 	.enqueue		= (void *) dsp_local_on_enqueue,
 	.dispatch		= (void *) dsp_local_on_dispatch,
-- 
2.48.1

