All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Vernet <void@manifault.com>
To: Tejun Heo <tj@kernel.org>
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com,
	schatzberg.dan@gmail.com, mingo@redhat.com, peterz@infradead.org,
	changwoo@igalia.com, righi.andrea@gmail.com
Subject: Re: [PATCH 6/6] sched_ext/scx_qmap: Pick idle CPU for direct dispatch on !wakeup enqueues
Date: Wed, 10 Jul 2024 14:25:23 -0500	[thread overview]
Message-ID: <20240710192523.GF317151@maniforge> (raw)
In-Reply-To: <20240709212137.1199269-7-tj@kernel.org>

[-- Attachment #1: Type: text/plain, Size: 5282 bytes --]

On Tue, Jul 09, 2024 at 11:21:12AM -1000, Tejun Heo wrote:
> Because there was no way to directly dispatch to the local DSQ of a remote
> CPU from ops.enqueue(), scx_qmap skipped looking for an idle CPU on !wakeup
> enqueues. This restriction was removed and schbed_ext now allows

s/schbed_ext/sched_ext

> SCX_DSQ_LOCAL_ON verdicts for direct dispatches.
> 
> Factor out pick_direct_dispatch_cpu() from ops.select_cpu() and use it to
> direct dispatch from ops.enqueue() on !wakeup enqueues.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: David Vernet <void@manifault.com>
> Cc: Dan Schatzberg <schatzberg.dan@gmail.com>
> Cc: Changwoo Min <changwoo@igalia.com>
> Cc: Andrea Righi <righi.andrea@gmail.com>

Hi Tejun,

This LG as is, but I also left a comment below in case we want to tweak. Feel
free to just apply the tag if you'd rather not iterate given that this is just
an example scheduler.

Acked-by: David Vernet <void@manifault.com>

> ---
>  tools/sched_ext/scx_qmap.bpf.c | 39 ++++++++++++++++++++++++++--------
>  tools/sched_ext/scx_qmap.c     |  5 +++--
>  2 files changed, 33 insertions(+), 11 deletions(-)
> 
> diff --git a/tools/sched_ext/scx_qmap.bpf.c b/tools/sched_ext/scx_qmap.bpf.c
> index 27e35066a602..892278f12dce 100644
> --- a/tools/sched_ext/scx_qmap.bpf.c
> +++ b/tools/sched_ext/scx_qmap.bpf.c
> @@ -120,11 +120,26 @@ struct {
>  } cpu_ctx_stor SEC(".maps");
>  
>  /* Statistics */
> -u64 nr_enqueued, nr_dispatched, nr_reenqueued, nr_dequeued;
> +u64 nr_enqueued, nr_dispatched, nr_reenqueued, nr_dequeued, nr_ddsp_from_enq;
>  u64 nr_core_sched_execed;
>  u32 cpuperf_min, cpuperf_avg, cpuperf_max;
>  u32 cpuperf_target_min, cpuperf_target_avg, cpuperf_target_max;
>  
> +static s32 pick_direct_dispatch_cpu(struct task_struct *p, s32 prev_cpu)
> +{
> +	s32 cpu;
> +
> +	if (p->nr_cpus_allowed == 1 ||
> +	    scx_bpf_test_and_clear_cpu_idle(prev_cpu))
> +		return prev_cpu;
> +
> +	cpu = scx_bpf_pick_idle_cpu(p->cpus_ptr, 0);
> +	if (cpu >= 0)
> +		return cpu;
> +
> +	return -1;
> +}
> +
>  s32 BPF_STRUCT_OPS(qmap_select_cpu, struct task_struct *p,
>  		   s32 prev_cpu, u64 wake_flags)
>  {
> @@ -137,17 +152,14 @@ s32 BPF_STRUCT_OPS(qmap_select_cpu, struct task_struct *p,
>  		return -ESRCH;
>  	}
>  
> -	if (p->nr_cpus_allowed == 1 ||
> -	    scx_bpf_test_and_clear_cpu_idle(prev_cpu)) {
> +	cpu = pick_direct_dispatch_cpu(p, prev_cpu);
> +
> +	if (cpu >= 0) {
>  		tctx->force_local = true;
> +		return cpu;
> +	} else {
>  		return prev_cpu;
>  	}
> -
> -	cpu = scx_bpf_pick_idle_cpu(p->cpus_ptr, 0);
> -	if (cpu >= 0)
> -		return cpu;
> -
> -	return prev_cpu;
>  }
>  
>  static int weight_to_idx(u32 weight)
> @@ -172,6 +184,7 @@ void BPF_STRUCT_OPS(qmap_enqueue, struct task_struct *p, u64 enq_flags)
>  	u32 pid = p->pid;
>  	int idx = weight_to_idx(p->scx.weight);
>  	void *ring;
> +	s32 cpu;
>  
>  	if (p->flags & PF_KTHREAD) {
>  		if (stall_kernel_nth && !(++kernel_cnt % stall_kernel_nth))
> @@ -207,6 +220,14 @@ void BPF_STRUCT_OPS(qmap_enqueue, struct task_struct *p, u64 enq_flags)
>  		return;
>  	}
>  
> +	/* if !WAKEUP, select_cpu() wasn't called, try direct dispatch */
> +	if (!(enq_flags & SCX_ENQ_WAKEUP) &&
> +	    (cpu = pick_direct_dispatch_cpu(p, scx_bpf_task_cpu(p))) >= 0) {
> +		__sync_fetch_and_add(&nr_ddsp_from_enq, 1);
> +		scx_bpf_dispatch(p, SCX_DSQ_LOCAL_ON | cpu, slice_ns, enq_flags);
> +		return;
> +	}

Hmm, will this be a typical pattern for how this is used? I'd expect
ops.select_cpu() and ops.enqueue() to quite often be nearly the same
implementation. Meaning you would e.g. try to find an idle core in both, and do
SCX_DSQ_LOCAL_ON, with the difference being that you'd just return the cpu and
save the extra lock juggling if you did it on the ops.select_cpu() path. Not a
huge deal given that it's just an example scheduler, but it might be a good
idea to try and mirror typical use cases for that reason as well so readers get
an idea of what a typical pattern would look like.

> +
>  	/*
>  	 * If the task was re-enqueued due to the CPU being preempted by a
>  	 * higher priority scheduling class, just re-enqueue the task directly
> diff --git a/tools/sched_ext/scx_qmap.c b/tools/sched_ext/scx_qmap.c
> index 304f0488a386..c9ca30d62b2b 100644
> --- a/tools/sched_ext/scx_qmap.c
> +++ b/tools/sched_ext/scx_qmap.c
> @@ -116,10 +116,11 @@ int main(int argc, char **argv)
>  		long nr_enqueued = skel->bss->nr_enqueued;
>  		long nr_dispatched = skel->bss->nr_dispatched;
>  
> -		printf("stats  : enq=%lu dsp=%lu delta=%ld reenq=%"PRIu64" deq=%"PRIu64" core=%"PRIu64"\n",
> +		printf("stats  : enq=%lu dsp=%lu delta=%ld reenq=%"PRIu64" deq=%"PRIu64" core=%"PRIu64" enq_ddsp=%"PRIu64"\n",
>  		       nr_enqueued, nr_dispatched, nr_enqueued - nr_dispatched,
>  		       skel->bss->nr_reenqueued, skel->bss->nr_dequeued,
> -		       skel->bss->nr_core_sched_execed);
> +		       skel->bss->nr_core_sched_execed,
> +		       skel->bss->nr_ddsp_from_enq);
>  		if (__COMPAT_has_ksym("scx_bpf_cpuperf_cur"))
>  			printf("cpuperf: cur min/avg/max=%u/%u/%u target min/avg/max=%u/%u/%u\n",
>  			       skel->bss->cpuperf_min,
> -- 
> 2.45.2
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

  reply	other threads:[~2024-07-10 19:25 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-09 21:21 [PATCHSET sched_ext/for-6.11] sched_ext: Allow SCX_DSQ_LOCAL_ON for direct dispatches Tejun Heo
2024-07-09 21:21 ` [PATCH 1/6] sched: Move struct balance_callback definition upward Tejun Heo
2024-07-10 18:53   ` David Vernet
2024-07-09 21:21 ` [PATCH 2/6] sched_ext: Open-code task_linked_on_dsq() Tejun Heo
2024-07-10  8:56   ` Andrea Righi
2024-07-10 18:53   ` David Vernet
2024-07-09 21:21 ` [PATCH 3/6] sched_ext: Make @rf optional for dispatch_to_local_dsq() Tejun Heo
2024-07-10  8:59   ` Andrea Righi
2024-07-10 11:41     ` Peter Zijlstra
2024-07-10 16:39       ` Tejun Heo
2024-07-10 16:10   ` David Vernet
2024-07-10 16:46     ` Tejun Heo
2024-07-09 21:21 ` [PATCH 4/6] sched_ext: s/SCX_RQ_BALANCING/SCX_RQ_IN_BALANCE/ and add SCX_RQ_IN_WAKEUP Tejun Heo
2024-07-10 18:54   ` David Vernet
2024-07-09 21:21 ` [PATCH 5/6] sched_ext: Allow SCX_DSQ_LOCAL_ON for direct dispatches Tejun Heo
2024-07-10 18:51   ` David Vernet
2024-07-10 23:43     ` Tejun Heo
2024-07-09 21:21 ` [PATCH 6/6] sched_ext/scx_qmap: Pick idle CPU for direct dispatch on !wakeup enqueues Tejun Heo
2024-07-10 19:25   ` David Vernet [this message]
2024-07-11  0:05     ` Tejun Heo
2024-07-10  9:01 ` [PATCHSET sched_ext/for-6.11] sched_ext: Allow SCX_DSQ_LOCAL_ON for direct dispatches Andrea Righi
  -- strict thread matches above, loose matches on Subject: below --
2024-07-11  1:13 [PATCHSET v2 " Tejun Heo
2024-07-11  1:14 ` [PATCH 6/6] sched_ext/scx_qmap: Pick idle CPU for direct dispatch on !wakeup enqueues Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240710192523.GF317151@maniforge \
    --to=void@manifault.com \
    --cc=changwoo@igalia.com \
    --cc=kernel-team@meta.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=righi.andrea@gmail.com \
    --cc=schatzberg.dan@gmail.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.