public inbox for linux-kernel@vger.kernel.org
From: Andrea Righi <arighi@nvidia.com>
To: Tejun Heo <tj@kernel.org>
Cc: David Vernet <void@manifault.com>,
	Changwoo Min <changwoo@igalia.com>,
	sched-ext@lists.linux.dev, Emil Tsalapatis <emil@etsalapatis.com>,
	linux-kernel@vger.kernel.org,
	Cheng-Yang Chou <yphbchou0911@gmail.com>
Subject: Re: [PATCH 08/17] sched_ext: Add scx_bpf_cid_override() kfunc
Date: Wed, 29 Apr 2026 16:07:20 +0200
Message-ID: <afIQmB4H6li6jL4I@gpd4>
In-Reply-To: <20260428203545.181052-9-tj@kernel.org>

Hi Tejun,

On Tue, Apr 28, 2026 at 10:35:36AM -1000, Tejun Heo wrote:
> The auto-probed cid mapping reflects the kernel's view of topology
> (node -> LLC -> core), but a BPF scheduler may want a different layout -
> to align cid slices with its own partitioning, or to work around how the
> kernel reports a particular machine.
> 
> Add scx_bpf_cid_override(), callable from ops.init() of the root
> scheduler. It validates the caller-supplied cpu->cid array and replaces
> the in-place mapping; topo info is invalidated. A compat.bpf.h wrapper
> silently no-ops on kernels that lack the kfunc.
> 
> A new SCX_KF_ALLOW_INIT bit in the kfunc context filter restricts the
> kfunc to ops.init() at verifier load time.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reviewed-by: Cheng-Yang Chou <yphbchou0911@gmail.com>
...
> +/**
> + * scx_bpf_cid_override - Install an explicit cpu->cid mapping
> + * @cpu_to_cid: array of nr_cpu_ids s32 entries (cid for each cpu)
> + * @cpu_to_cid__sz: must be nr_cpu_ids * sizeof(s32) bytes
> + * @aux: implicit BPF argument to access bpf_prog_aux hidden from BPF progs
> + *
> + * May only be called from ops.init() of the root scheduler. Replace the
> + * topology-probed cid mapping with the caller-provided one. Each possible cpu
> + * must map to a unique cid in [0, num_possible_cpus()). Topo info is cleared.
> + * On invalid input, trigger scx_error() to abort the scheduler.
> + */
> +__bpf_kfunc void scx_bpf_cid_override(const s32 *cpu_to_cid, u32 cpu_to_cid__sz,
> +				      const struct bpf_prog_aux *aux)
> +{
> +	cpumask_var_t seen __free(free_cpumask_var) = CPUMASK_VAR_NULL;
> +	struct scx_sched *sch;
> +	bool alloced;
> +	s32 cpu, cid;
> +
> +	/* GFP_KERNEL alloc must happen before the rcu read section */
> +	alloced = zalloc_cpumask_var(&seen, GFP_KERNEL);
> +
> +	guard(rcu)();
> +
> +	sch = scx_prog_sched(aux);
> +	if (unlikely(!sch))
> +		return;
> +
> +	if (!alloced) {
> +		scx_error(sch, "scx_bpf_cid_override: failed to allocate cpumask");
> +		return;
> +	}
> +
> +	if (scx_parent(sch)) {
> +		scx_error(sch, "scx_bpf_cid_override() only allowed from root sched");
> +		return;
> +	}
> +
> +	if (cpu_to_cid__sz != nr_cpu_ids * sizeof(s32)) {
> +		scx_error(sch, "scx_bpf_cid_override: expected %zu bytes, got %u",
> +			  nr_cpu_ids * sizeof(s32), cpu_to_cid__sz);
> +		return;
> +	}
> +
> +	for_each_possible_cpu(cpu) {
> +		s32 c = cpu_to_cid[cpu];
> +
> +		if (!cid_valid(sch, c))
> +			return;
> +		if (cpumask_test_and_set_cpu(c, seen)) {
> +			scx_error(sch, "cid %d assigned to multiple cpus", c);
> +			return;
> +		}
> +		scx_cpu_to_cid_tbl[cpu] = c;
> +		scx_cid_to_cpu_tbl[c] = cpu;
> +	}
> +
> +	/* Invalidate stale topo info - the override carries no topology. */
> +	for (cid = 0; cid < num_possible_cpus(); cid++)
> +		scx_cid_topo[cid] = SCX_CID_TOPO_NEG;
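
For reference, my reading of the loop above is that it accepts the array only
if it is a permutation of [0, num_possible_cpus()). A minimal userspace sketch
of that check follows. All names are hypothetical, a plain bool array stands in
for the cpumask, cpus are assumed to be densely numbered, and the range test is
only my assumption of what cid_valid() does:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MODEL_NR_CPUS 8	/* hypothetical upper bound for this model */

/*
 * Userspace model, not kernel code: return 0 if cpu_to_cid[] maps the n
 * cpus onto n distinct cids in [0, n), -1 otherwise.
 */
static int model_validate_cid_map(const int32_t *cpu_to_cid, uint32_t n)
{
	bool seen[MODEL_NR_CPUS];
	uint32_t cpu;

	assert(n <= MODEL_NR_CPUS);
	memset(seen, 0, sizeof(seen));

	for (cpu = 0; cpu < n; cpu++) {
		int32_t c = cpu_to_cid[cpu];

		/* Assumed equivalent of the cid_valid() range check. */
		if (c < 0 || (uint32_t)c >= n)
			return -1;
		/* Mirrors the "cid assigned to multiple cpus" error. */
		if (seen[c])
			return -1;
		seen[c] = true;
	}
	return 0;
}
```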

Since scx_bpf_cid_override() wipes the topology info, should we error out if a
scheduler then also calls scx_bpf_cid_topo() (e.g., by setting a flag at
override time and checking it in the topo lookup)?
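
Roughly what I have in mind, as a userspace sketch only. Every name here is
hypothetical: struct model_sched stands in for struct scx_sched, the errored
field stands in for a call to scx_error(), an int stands in for whatever type
the topo entries and SCX_CID_TOPO_NEG really have, and the lookup signature is
not meant to match scx_bpf_cid_topo():

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_CIDS	4
#define MODEL_TOPO_NEG	(-1)	/* stand-in for SCX_CID_TOPO_NEG */

/* Hypothetical stand-in for the per-scheduler state. */
struct model_sched {
	bool cid_overridden;	/* set once the mapping is overridden */
	bool errored;		/* models scx_error() having fired */
	int32_t topo[MODEL_NR_CIDS];
};

/* Models the tail of the override: wipe topo info and remember it. */
static void model_cid_override(struct model_sched *sch)
{
	int32_t cid;

	for (cid = 0; cid < MODEL_NR_CIDS; cid++)
		sch->topo[cid] = MODEL_TOPO_NEG;
	sch->cid_overridden = true;
}

/* Models a topo lookup that refuses to run after an override. */
static int32_t model_cid_topo(struct model_sched *sch, int32_t cid)
{
	assert(cid >= 0 && cid < MODEL_NR_CIDS);

	if (sch->cid_overridden) {
		sch->errored = true;
		return MODEL_TOPO_NEG;
	}
	return sch->topo[cid];
}
```

That way a scheduler mixing the two gets a clear error instead of silently
reading negative topo entries.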

Thanks,
-Andrea

Thread overview: 27+ messages
2026-04-28 20:35 [PATCHSET v3 sched_ext/for-7.2] sched_ext: Topological CPU IDs and cid-form struct_ops Tejun Heo
2026-04-28 20:35 ` [PATCH 01/17] sched_ext: Add ext_types.h for early subsystem-wide defs Tejun Heo
2026-04-28 20:35 ` [PATCH 02/17] sched_ext: Rename ops_cpu_valid() to scx_cpu_valid() and expose it Tejun Heo
2026-04-28 20:35 ` [PATCH 03/17] sched_ext: Move scx_exit(), scx_error() and friends to ext_internal.h Tejun Heo
2026-04-28 20:35 ` [PATCH 04/17] sched_ext: Shift scx_kick_cpu() validity check to scx_bpf_kick_cpu() Tejun Heo
2026-04-28 20:35 ` [PATCH 05/17] sched_ext: Relocate cpu_acquire/cpu_release to end of struct sched_ext_ops Tejun Heo
2026-04-28 20:35 ` [PATCH 06/17] sched_ext: Make scx_enable() take scx_enable_cmd Tejun Heo
2026-04-28 20:35 ` [PATCH 07/17] sched_ext: Add topological CPU IDs (cids) Tejun Heo
2026-04-28 20:35 ` [PATCH 08/17] sched_ext: Add scx_bpf_cid_override() kfunc Tejun Heo
2026-04-29 14:07   ` Andrea Righi [this message]
2026-04-28 20:35 ` [PATCH 09/17] tools/sched_ext: Add struct_size() helpers to common.bpf.h Tejun Heo
2026-04-28 20:35 ` [PATCH 10/17] sched_ext: Add cmask, a base-windowed bitmap over cid space Tejun Heo
2026-04-29 12:47   ` Changwoo Min
2026-04-28 20:35 ` [PATCH 11/17] sched_ext: Add cid-form kfunc wrappers alongside cpu-form Tejun Heo
2026-04-28 20:35 ` [PATCH 12/17] sched_ext: Add bpf_sched_ext_ops_cid struct_ops type Tejun Heo
2026-04-28 20:35 ` [PATCH 13/17] sched_ext: Forbid cpu-form kfuncs from cid-form schedulers Tejun Heo
2026-04-28 20:35 ` [PATCH 14/17] tools/sched_ext: scx_qmap: Restart on hotplug instead of cpu_online/offline Tejun Heo
2026-04-28 20:35 ` [PATCH 15/17] tools/sched_ext: scx_qmap: Add cmask-based idle tracking and cid-based idle pick Tejun Heo
2026-04-28 20:35 ` [PATCH 16/17] tools/sched_ext: scx_qmap: Port to cid-form struct_ops Tejun Heo
2026-04-29 12:47   ` Changwoo Min
2026-04-29 13:53     ` Andrea Righi
2026-04-28 20:35 ` [PATCH 17/17] sched_ext: Require cid-form struct_ops for sub-sched support Tejun Heo
2026-04-29 12:49 ` [PATCHSET v3 sched_ext/for-7.2] sched_ext: Topological CPU IDs and cid-form struct_ops Changwoo Min
2026-04-29 13:29 ` Andrea Righi
2026-04-29 14:11   ` Andrea Righi
  -- strict thread matches above, loose matches on Subject: below --
2026-04-24 17:27 [PATCHSET v2 REPOST " Tejun Heo
2026-04-24 17:27 ` [PATCH 08/17] sched_ext: Add scx_bpf_cid_override() kfunc Tejun Heo
2026-04-24  1:32 Tejun Heo
