public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: David Vernet <void@manifault.com>
To: Tejun Heo <tj@kernel.org>
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com,
	schatzberg.dan@gmail.com, mingo@redhat.com, peterz@infradead.org,
	changwoo@igalia.com, righi.andrea@gmail.com
Subject: Re: [PATCH 6/6] sched_ext/scx_qmap: Pick idle CPU for direct dispatch on !wakeup enqueues
Date: Wed, 10 Jul 2024 14:25:23 -0500	[thread overview]
Message-ID: <20240710192523.GF317151@maniforge> (raw)
In-Reply-To: <20240709212137.1199269-7-tj@kernel.org>

[-- Attachment #1: Type: text/plain, Size: 5282 bytes --]

On Tue, Jul 09, 2024 at 11:21:12AM -1000, Tejun Heo wrote:
> Because there was no way to directly dispatch to the local DSQ of a remote
> CPU from ops.enqueue(), scx_qmap skipped looking for an idle CPU on !wakeup
> enqueues. This restriction was removed and schbed_ext now allows

s/schbed_ext/sched_ext

> SCX_DSQ_LOCAL_ON verdicts for direct dispatches.
> 
> Factor out pick_direct_dispatch_cpu() from ops.select_cpu() and use it to
> direct dispatch from ops.enqueue() on !wakeup enqueues.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: David Vernet <void@manifault.com>
> Cc: Dan Schatzberg <schatzberg.dan@gmail.com>
> Cc: Changwoo Min <changwoo@igalia.com>
> Cc: Andrea Righi <righi.andrea@gmail.com>

Hi Tejun,

This LG as is, but I also left a comment below in case we want to tweak. Feel
free to just apply the tag if you'd rather not iterate given that this is just
an example scheduler.

Acked-by: David Vernet <void@manifault.com>

> ---
>  tools/sched_ext/scx_qmap.bpf.c | 39 ++++++++++++++++++++++++++--------
>  tools/sched_ext/scx_qmap.c     |  5 +++--
>  2 files changed, 33 insertions(+), 11 deletions(-)
> 
> diff --git a/tools/sched_ext/scx_qmap.bpf.c b/tools/sched_ext/scx_qmap.bpf.c
> index 27e35066a602..892278f12dce 100644
> --- a/tools/sched_ext/scx_qmap.bpf.c
> +++ b/tools/sched_ext/scx_qmap.bpf.c
> @@ -120,11 +120,26 @@ struct {
>  } cpu_ctx_stor SEC(".maps");
>  
>  /* Statistics */
> -u64 nr_enqueued, nr_dispatched, nr_reenqueued, nr_dequeued;
> +u64 nr_enqueued, nr_dispatched, nr_reenqueued, nr_dequeued, nr_ddsp_from_enq;
>  u64 nr_core_sched_execed;
>  u32 cpuperf_min, cpuperf_avg, cpuperf_max;
>  u32 cpuperf_target_min, cpuperf_target_avg, cpuperf_target_max;
>  
> +static s32 pick_direct_dispatch_cpu(struct task_struct *p, s32 prev_cpu)
> +{
> +	s32 cpu;
> +
> +	if (p->nr_cpus_allowed == 1 ||
> +	    scx_bpf_test_and_clear_cpu_idle(prev_cpu))
> +		return prev_cpu;
> +
> +	cpu = scx_bpf_pick_idle_cpu(p->cpus_ptr, 0);
> +	if (cpu >= 0)
> +		return cpu;
> +
> +	return -1;
> +}
> +
>  s32 BPF_STRUCT_OPS(qmap_select_cpu, struct task_struct *p,
>  		   s32 prev_cpu, u64 wake_flags)
>  {
> @@ -137,17 +152,14 @@ s32 BPF_STRUCT_OPS(qmap_select_cpu, struct task_struct *p,
>  		return -ESRCH;
>  	}
>  
> -	if (p->nr_cpus_allowed == 1 ||
> -	    scx_bpf_test_and_clear_cpu_idle(prev_cpu)) {
> +	cpu = pick_direct_dispatch_cpu(p, prev_cpu);
> +
> +	if (cpu >= 0) {
>  		tctx->force_local = true;
> +		return cpu;
> +	} else {
>  		return prev_cpu;
>  	}
> -
> -	cpu = scx_bpf_pick_idle_cpu(p->cpus_ptr, 0);
> -	if (cpu >= 0)
> -		return cpu;
> -
> -	return prev_cpu;
>  }
>  
>  static int weight_to_idx(u32 weight)
> @@ -172,6 +184,7 @@ void BPF_STRUCT_OPS(qmap_enqueue, struct task_struct *p, u64 enq_flags)
>  	u32 pid = p->pid;
>  	int idx = weight_to_idx(p->scx.weight);
>  	void *ring;
> +	s32 cpu;
>  
>  	if (p->flags & PF_KTHREAD) {
>  		if (stall_kernel_nth && !(++kernel_cnt % stall_kernel_nth))
> @@ -207,6 +220,14 @@ void BPF_STRUCT_OPS(qmap_enqueue, struct task_struct *p, u64 enq_flags)
>  		return;
>  	}
>  
> +	/* if !WAKEUP, select_cpu() wasn't called, try direct dispatch */
> +	if (!(enq_flags & SCX_ENQ_WAKEUP) &&
> +	    (cpu = pick_direct_dispatch_cpu(p, scx_bpf_task_cpu(p))) >= 0) {
> +		__sync_fetch_and_add(&nr_ddsp_from_enq, 1);
> +		scx_bpf_dispatch(p, SCX_DSQ_LOCAL_ON | cpu, slice_ns, enq_flags);
> +		return;
> +	}

Hmm, will this be a typical pattern for how this is used? I'd expect
ops.select_cpu() and ops.enqueue() to quite often be nearly the same
implementation. Meaning you would e.g. try to find an idle core in both, and do
SCX_DSQ_LOCAL_ON, with the difference being that you'd just return the cpu and
save the extra lock juggling if you did it on the ops.select_cpu() path. Not a
huge deal given that it's just an example scheduler, but it might be a good
idea to try and mirror typical use cases for that reason as well so readers get
an idea of what a typical pattern would look like.

> +
>  	/*
>  	 * If the task was re-enqueued due to the CPU being preempted by a
>  	 * higher priority scheduling class, just re-enqueue the task directly
> diff --git a/tools/sched_ext/scx_qmap.c b/tools/sched_ext/scx_qmap.c
> index 304f0488a386..c9ca30d62b2b 100644
> --- a/tools/sched_ext/scx_qmap.c
> +++ b/tools/sched_ext/scx_qmap.c
> @@ -116,10 +116,11 @@ int main(int argc, char **argv)
>  		long nr_enqueued = skel->bss->nr_enqueued;
>  		long nr_dispatched = skel->bss->nr_dispatched;
>  
> -		printf("stats  : enq=%lu dsp=%lu delta=%ld reenq=%"PRIu64" deq=%"PRIu64" core=%"PRIu64"\n",
> +		printf("stats  : enq=%lu dsp=%lu delta=%ld reenq=%"PRIu64" deq=%"PRIu64" core=%"PRIu64" enq_ddsp=%"PRIu64"\n",
>  		       nr_enqueued, nr_dispatched, nr_enqueued - nr_dispatched,
>  		       skel->bss->nr_reenqueued, skel->bss->nr_dequeued,
> -		       skel->bss->nr_core_sched_execed);
> +		       skel->bss->nr_core_sched_execed,
> +		       skel->bss->nr_ddsp_from_enq);
>  		if (__COMPAT_has_ksym("scx_bpf_cpuperf_cur"))
>  			printf("cpuperf: cur min/avg/max=%u/%u/%u target min/avg/max=%u/%u/%u\n",
>  			       skel->bss->cpuperf_min,
> -- 
> 2.45.2
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

  reply	other threads:[~2024-07-10 19:25 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-09 21:21 [PATCHSET sched_ext/for-6.11] sched_ext: Allow SCX_DSQ_LOCAL_ON for direct dispatches Tejun Heo
2024-07-09 21:21 ` [PATCH 1/6] sched: Move struct balance_callback definition upward Tejun Heo
2024-07-10 18:53   ` David Vernet
2024-07-09 21:21 ` [PATCH 2/6] sched_ext: Open-code task_linked_on_dsq() Tejun Heo
2024-07-10  8:56   ` Andrea Righi
2024-07-10 18:53   ` David Vernet
2024-07-09 21:21 ` [PATCH 3/6] sched_ext: Make @rf optional for dispatch_to_local_dsq() Tejun Heo
2024-07-10  8:59   ` Andrea Righi
2024-07-10 11:41     ` Peter Zijlstra
2024-07-10 16:39       ` Tejun Heo
2024-07-10 16:10   ` David Vernet
2024-07-10 16:46     ` Tejun Heo
2024-07-09 21:21 ` [PATCH 4/6] sched_ext: s/SCX_RQ_BALANCING/SCX_RQ_IN_BALANCE/ and add SCX_RQ_IN_WAKEUP Tejun Heo
2024-07-10 18:54   ` David Vernet
2024-07-09 21:21 ` [PATCH 5/6] sched_ext: Allow SCX_DSQ_LOCAL_ON for direct dispatches Tejun Heo
2024-07-10 18:51   ` David Vernet
2024-07-10 23:43     ` Tejun Heo
2024-07-09 21:21 ` [PATCH 6/6] sched_ext/scx_qmap: Pick idle CPU for direct dispatch on !wakeup enqueues Tejun Heo
2024-07-10 19:25   ` David Vernet [this message]
2024-07-11  0:05     ` Tejun Heo
2024-07-10  9:01 ` [PATCHSET sched_ext/for-6.11] sched_ext: Allow SCX_DSQ_LOCAL_ON for direct dispatches Andrea Righi
  -- strict thread matches above, loose matches on Subject: below --
2024-07-11  1:13 [PATCHSET v2 " Tejun Heo
2024-07-11  1:14 ` [PATCH 6/6] sched_ext/scx_qmap: Pick idle CPU for direct dispatch on !wakeup enqueues Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240710192523.GF317151@maniforge \
    --to=void@manifault.com \
    --cc=changwoo@igalia.com \
    --cc=kernel-team@meta.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=righi.andrea@gmail.com \
    --cc=schatzberg.dan@gmail.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox