public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Andrea Righi <arighi@nvidia.com>
To: Changwoo Min <multics69@gmail.com>
Cc: tj@kernel.org, void@manifault.com, mingo@redhat.com,
	peterz@infradead.org, changwoo@igalia.com, kernel-dev@igalia.com,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 0/6] sched_ext: Support high-performance monotonically non-decreasing clock
Date: Mon, 9 Dec 2024 10:51:44 +0100	[thread overview]
Message-ID: <Z1a9sKyELhH-e4lJ@gpd3> (raw)
In-Reply-To: <20241209061531.257531-1-changwoo@igalia.com>

On Mon, Dec 09, 2024 at 03:15:25PM +0900, Changwoo Min wrote:
> Many BPF schedulers (such as scx_central, scx_lavd, scx_rusty, scx_bpfland,
> and scx_flash) frequently call bpf_ktime_get_ns() for tracking tasks' runtime
> properties. If supported, bpf_ktime_get_ns() eventually reads a hardware
> timestamp counter (TSC). However, reading a hardware TSC is not
> performant in some hardware platforms, degrading IPC.
> 
> This patchset addresses the performance problem of reading hardware TSC
> by leveraging the rq clock in the scheduler core, introducing a
> scx_bpf_now_ns() function for BPF schedulers. Whenever the rq clock
> is fresh and valid, scx_bpf_now_ns() provides the rq clock, which is
> already updated by the scheduler core (update_rq_clock), so it can reduce
> reading the hardware TSC.
> 
> When the rq lock is released (rq_unpin_lock), the rq clock is invalidated,
> so a subsequent scx_bpf_now_ns() call gets the fresh sched_clock for the caller.
> 
> In addition, scx_bpf_now_ns() guarantees the clock is monotonically
> non-decreasing for the same CPU, so the clock cannot go backward
> in the same CPU.
> 
> Using scx_bpf_now_ns() reduces the number of reading hardware TSC
> by 40-70% (65% for scx_lavd, 58% for scx_bpfland, and 43% for scx_rusty)
> for the following benchmark:
> 
>     perf bench -f simple sched messaging -t -g 20 -l 6000
> 
> The patchset begins by managing the status of rq clock in the scheduler
> core, then implementing scx_bpf_now_ns(), and finally applying it to the
> BPF schedulers.

I left a few comments, but overall it looks good to me. I also ran some
tests with this applied and a modified scx_bpfland to use the new
scx_bpf_now_ns(), no issue to report, therefore:

Acked-by: Andrea Righi <arighi@nvidia.com>

> 
> ChangwLog v3 -> v4:
>   - Separate the code relocation related to scx_enabled() into a
>     separate patch.
>   - Remove scx_rq_clock_stale() after (or before) ops.running() and
>     ops.update_idle() calls
>   - Rename scx_bpf_clock_get_ns() into scx_bpf_now_ns() and revise it to
>     address the comments
>   - Move the per-CPU variable holding a prev clock into scx_rq
>     (rq->scx.prev_clock)
>   - Add a comment describing when the clock could go backward in
>     scx_bpf_now_ns()
>   - Rebase the code to the tip of Tejun's sched_ext repo (for-next
>     branch)
> 
> ChangeLog v2 -> v3:
>   - To avoid unnecessarily modifying cache lines, scx_rq_clock_update()
>     and scx_rq_clock_stale() update the clock and flags only when a
>     sched_ext scheduler is enabled.
> 
> ChangeLog v1 -> v2:
>   - Rename SCX_RQ_CLK_UPDATED to SCX_RQ_CLK_VALID to denote the validity
>     of an rq clock clearly.
>   - Rearrange the clock and flags fields in struct scx_rq to make sure
>     they are in the same cacheline to minimize the cache misses 
>   - Add an additional explanation to the commit message in the 2/5 patch
>     describing when the rq clock will be reused with an example.
>   - Fix typos
>   - Rebase the code to the tip of Tejun's sched_ext repo
> 
> Changwoo Min (6):
>   sched_ext: Relocate scx_enabled() related code
>   sched_ext: Implement scx_rq_clock_update/stale()
>   sched_ext: Manage the validity of scx_rq_clock
>   sched_ext: Implement scx_bpf_now_ns()
>   sched_ext: Add scx_bpf_now_ns() for BPF scheduler
>   sched_ext: Replace bpf_ktime_get_ns() to scx_bpf_now_ns()
> 
>  kernel/sched/core.c                      |  6 +-
>  kernel/sched/ext.c                       | 73 ++++++++++++++++++++++++
>  kernel/sched/sched.h                     | 52 ++++++++++++-----
>  tools/sched_ext/include/scx/common.bpf.h |  1 +
>  tools/sched_ext/include/scx/compat.bpf.h |  5 ++
>  tools/sched_ext/scx_central.bpf.c        |  4 +-
>  tools/sched_ext/scx_flatcg.bpf.c         |  2 +-
>  7 files changed, 124 insertions(+), 19 deletions(-)
> 
> -- 
> 2.47.1
> 

-Andrea

      parent reply	other threads:[~2024-12-09  9:51 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-09  6:15 [PATCH v4 0/6] sched_ext: Support high-performance monotonically non-decreasing clock Changwoo Min
2024-12-09  6:15 ` [PATCH v4 1/6] sched_ext: Relocate scx_enabled() related code Changwoo Min
2024-12-11  7:27   ` Tejun Heo
2024-12-11  7:37     ` Tejun Heo
2024-12-09  6:15 ` [PATCH v4 2/6] sched_ext: Implement scx_rq_clock_update/stale() Changwoo Min
2024-12-09  9:40   ` Andrea Righi
2024-12-10  7:21     ` Changwoo Min
2024-12-11  7:43   ` Tejun Heo
2024-12-13  1:16     ` Changwoo Min
2024-12-09  6:15 ` [PATCH v4 3/6] sched_ext: Manage the validity of scx_rq_clock Changwoo Min
2024-12-09  6:15 ` [PATCH v4 4/6] sched_ext: Implement scx_bpf_now_ns() Changwoo Min
2024-12-11  8:14   ` Tejun Heo
2024-12-13  1:41     ` Changwoo Min
2024-12-11  9:32   ` Peter Zijlstra
2024-12-13  2:01     ` Changwoo Min
2024-12-09  6:15 ` [PATCH v4 5/6] sched_ext: Add scx_bpf_now_ns() for BPF scheduler Changwoo Min
2024-12-09  6:15 ` [PATCH v4 6/6] sched_ext: Replace bpf_ktime_get_ns() to scx_bpf_now_ns() Changwoo Min
2024-12-09  9:51 ` Andrea Righi [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z1a9sKyELhH-e4lJ@gpd3 \
    --to=arighi@nvidia.com \
    --cc=changwoo@igalia.com \
    --cc=kernel-dev@igalia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=multics69@gmail.com \
    --cc=peterz@infradead.org \
    --cc=tj@kernel.org \
    --cc=void@manifault.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox