From: Andrea Righi <arighi@nvidia.com>
To: Changwoo Min <changwoo@igalia.com>
Cc: tj@kernel.org, void@manifault.com, mingo@redhat.com,
peterz@infradead.org, kernel-dev@igalia.com,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v8 0/6] sched_ext: Support high-performance monotonically non-decreasing clock
Date: Fri, 10 Jan 2025 10:08:19 +0100
Message-ID: <Z4Djg_Va5wZ90ZoV@gpd3>
In-Reply-To: <20250109131456.7055-1-changwoo@igalia.com>
Hi Changwoo,
On Thu, Jan 09, 2025 at 10:14:50PM +0900, Changwoo Min wrote:
> Many BPF schedulers (such as scx_central, scx_lavd, scx_rusty, scx_bpfland,
> and scx_flash) frequently call bpf_ktime_get_ns() for tracking tasks' runtime
> properties. If supported, bpf_ktime_get_ns() eventually reads a hardware
> timestamp counter (TSC). However, reading a hardware TSC is not
> performant on some hardware platforms, degrading IPC.
>
> This patchset addresses the performance problem of reading hardware TSC
> by leveraging the rq clock in the scheduler core, introducing a
> scx_bpf_now() function for BPF schedulers. Whenever the rq clock
> is fresh and valid, scx_bpf_now() provides the rq clock, which is
> already updated by the scheduler core (update_rq_clock), so it can reduce
> the number of hardware TSC reads.
>
> When the rq lock is released (rq_unpin_lock), the rq clock is invalidated,
> so a subsequent scx_bpf_now() call gets the fresh sched_clock for the caller.
>
> In addition, scx_bpf_now() guarantees the clock is monotonically
> non-decreasing on a given CPU, so the clock cannot go backward
> on that CPU.
>
> Using scx_bpf_now() reduces the number of hardware TSC reads
> by 50-80% (76% for scx_lavd, 82% for scx_bpfland, and 51% for scx_rusty)
> for the following benchmark:
>
> perf bench -f simple sched messaging -t -g 20 -l 6000
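
For readers following the thread, the caching behaviour described in the cover letter can be modelled in plain userspace C. This is only a sketch under stated assumptions, not the patch's implementation: the struct, its fields, and the helpers clock_update(), clock_invalidate(), clock_now() and read_hw_clock() are all hypothetical names, and the clamp against the last returned value is one assumed way to get the non-decreasing guarantee.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU state; field names are illustrative, not the kernel's. */
struct cpu_clock {
	uint64_t cached; /* clock value saved at the last update */
	uint64_t last;   /* last value handed out, used for clamping */
	bool valid;      /* true while the cached value may be reused */
};

/* Stand-in for an expensive hardware timestamp read; counts invocations. */
static uint64_t hw_time;
static unsigned int hw_reads;

static uint64_t read_hw_clock(void)
{
	hw_reads++;
	return hw_time;
}

/* Models the scheduler core refreshing the clock (cf. update_rq_clock). */
static void clock_update(struct cpu_clock *c)
{
	c->cached = read_hw_clock();
	c->valid = true;
}

/* Models invalidation when the rq lock is released (cf. rq_unpin_lock). */
static void clock_invalidate(struct cpu_clock *c)
{
	c->valid = false;
}

/*
 * Return the cached clock while it is valid, otherwise take a fresh
 * reading. The result is clamped (an assumption of this sketch) so it
 * never goes backward for the same struct cpu_clock.
 */
static uint64_t clock_now(struct cpu_clock *c)
{
	uint64_t now = c->valid ? c->cached : read_hw_clock();

	if (now < c->last)
		now = c->last;
	c->last = now;
	return now;
}
```

In this model, repeated clock_now() calls between clock_update() and clock_invalidate() cost no extra hardware reads, which is the kind of saving the benchmark numbers above measure.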
Looks good to me. I also ran some stress tests using scx_bpf_now() with
this new patch set and haven't noticed any issues.
Acked-by: Andrea Righi <arighi@nvidia.com>
Thanks,
-Andrea
Thread overview: 10+ messages
2025-01-09 13:14 [PATCH v8 0/6] sched_ext: Support high-performance monotonically non-decreasing clock Changwoo Min
2025-01-09 13:14 ` [PATCH v8 1/6] sched_ext: Relocate scx_enabled() related code Changwoo Min
2025-01-09 13:14 ` [PATCH v8 2/6] sched_ext: Implement scx_bpf_now() Changwoo Min
2025-01-10 8:31 ` Peter Zijlstra
2025-01-09 13:14 ` [PATCH v8 3/6] sched_ext: Add scx_bpf_now() for BPF scheduler Changwoo Min
2025-01-09 13:14 ` [PATCH v8 4/6] sched_ext: Add time helpers for BPF schedulers Changwoo Min
2025-01-09 13:14 ` [PATCH v8 5/6] sched_ext: Replace bpf_ktime_get_ns() to scx_bpf_now() Changwoo Min
2025-01-09 13:14 ` [PATCH v8 6/6] sched_ext: Use time helpers in BPF schedulers Changwoo Min
2025-01-10 9:08 ` Andrea Righi [this message]
2025-01-10 18:30 ` [PATCH v8 0/6] sched_ext: Support high-performance monotonically non-decreasing clock Tejun Heo