From: K Prateek Nayak <kprateek.nayak@amd.com>
To: "Chen, Yu C" <yu.c.chen@intel.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Valentin Schneider <vschneid@redhat.com>,
David Vernet <void@manifault.com>,
"Gautham R. Shenoy" <gautham.shenoy@amd.com>,
"Swapnil Sapkal" <swapnil.sapkal@amd.com>,
Shrikanth Hegde <sshegde@linux.ibm.com>,
<yu.chen.surf@foxmail.com>, Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
<linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH 7/8] sched/fair: Retrieve cached group stats from sg_lb_stats_prop
Date: Wed, 19 Mar 2025 12:12:53 +0530
Message-ID: <beaf9eca-33d9-4fc5-ae9f-64e50b2663cc@amd.com>
In-Reply-To: <3915af7f-fe05-4a1b-a8b2-e9690827b916@intel.com>

Hello Chenyu,

Thank you for taking a look at the series.
On 3/17/2025 11:34 PM, Chen, Yu C wrote:
> On 3/13/2025 5:37 PM, K Prateek Nayak wrote:
>> Allow update_sg_lb_stats() to retrieve the group stats cached in
>> sg_lb_stats_prop saved by another CPU performing load balancing around
>> the same time (same jiffy).
>>
>
> If I understand correctly, we allow update_sg_lb_stats() to retrieve
> cached sg stats if another CPU in the same sched group has done
> load balancing within the last jiffy, say 10ms for CONFIG_100_HZ.
Quick disclaimer: All of this is best effort currently.
Periodic load balancing is the easy case to start with since it happens
at most once per jiffy, so "last_update" as a jiffy counter should be
good enough (in most cases).
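The jiffy-stamped reuse described above can be modeled with a small
userspace sketch. All names here (cached_stats, cache_store,
cache_try_reuse) are illustrative, not the actual kernel structures:

```c
#include <stdbool.h>

/*
 * Toy userspace model of the jiffy-stamped cache: a reader reuses the
 * cached group stats only if a writer published them in the current
 * jiffy. Names (cached_stats, last_update) are illustrative.
 */
struct cached_stats {
	unsigned long last_update;	/* jiffy of the last write */
	unsigned long group_load;	/* example cached field */
};

static unsigned long jiffies;		/* stands in for the kernel's jiffies */

/* Writer: publish freshly computed stats stamped with the current jiffy. */
static void cache_store(struct cached_stats *c, unsigned long load)
{
	c->group_load = load;
	c->last_update = jiffies;
}

/* Reader: reuse only if the stamp matches this jiffy; else recompute. */
static bool cache_try_reuse(const struct cached_stats *c, unsigned long *load)
{
	if (c->last_update != jiffies)
		return false;		/* stale: caller recomputes */
	*load = c->group_load;
	return true;
}
```

Once a later jiffy arrives, the cached copy simply stops matching and
the reader falls back to recomputing, mirroring the best-effort reuse
window described above.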
Secondly, and this is slightly harder to solve for, there is the problem
of getting all the CPUs to actually sync up. Currently this is best
effort since the tick can fire late due to interrupts being disabled on
a CPU, SCHED_SOFTIRQ may run at different times depending on how much
work is done at the tick prior to reaching the softirq handler, etc.
But assuming some amount of sync, I would like:
- During busy balancing only one CPU gets to proceed as per the
  should_we_balance() heuristics. In addition, since all CPUs are busy
  (should_we_balance() would have allowed the first idle CPU to go
  ahead otherwise), the "idle_cpus" and "overloaded" situations may
  change and those are hard to propagate.
- By the time this CPU does busy balancing, other groups below it have
  hopefully had enough time to reach update_sd_lb_stats() and cache
  their copy for this jiffy. If not, the load balancing CPU will
  recompute.
- Since stats at a higher domain are used only once, there was no need
  to invalidate them - something I couldn't get right back then (or
  maybe even now :)
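The gating in the first point above can be sketched roughly as follows.
This is a toy model under the assumption that the first idle CPU in the
group proceeds, else the group's first CPU does; it is not the kernel's
should_we_balance() implementation:

```c
#include <stdbool.h>

/*
 * Toy model of the gating described above: the first idle CPU in the
 * group gets to balance; when every CPU is busy, only the group's
 * first CPU proceeds. A rough sketch of the heuristic, not the actual
 * should_we_balance() implementation.
 */
static int pick_balancer(const bool *cpu_idle, int nr_cpus)
{
	for (int cpu = 0; cpu < nr_cpus; cpu++)
		if (cpu_idle[cpu])
			return cpu;	/* first idle CPU wins */

	return 0;			/* all busy: first CPU balances */
}
```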
>
> There are two roles: the writer who updates the cached stats and
> the reader who reads the cached stats. For both the writer and
> the reader, do we trigger them only during busy periodic load
> balance? If yes, considering that the periodic load balance is
> usually triggered on 1 CPU in each SD (should_we_balance()), and
> that the interval increases with the number of CPUs in that sd, I
> just wonder if 10 ms is a little short to find cached stats on a
> large LLC?
So the reader is always the CPU going to the higher domain and
recomputing stats. The writer should have updated the stats by then,
or the reader won't care and will recompute them.
At the very least, since the CPU has to look at the local stats too,
the logic ensures those are reused and not recomputed.
Beyond the annotated PATCH 9, I've moved to a versioning scheme that
could also be reused for newidle balancing with stats invalidation,
and that should help reuse stats more. There are some numbers in the
(empty) annotated PATCH 9.
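As a rough illustration, a versioning scheme of this kind could look
like the following userspace model. All names here are hypothetical;
the real scheme lives in the annotated patches:

```c
#include <stdbool.h>

/*
 * Rough userspace model of versioned stats with invalidation: a writer
 * stamps the stats with the current version, and an explicit
 * invalidation bumps the version so any later reader recomputes
 * instead of reusing a stale copy. All names are illustrative.
 */
struct versioned_stats {
	unsigned long version;	/* bumped on invalidation */
	unsigned long stamp;	/* version at the time of the last write */
	unsigned long load;	/* example cached field */
};

/* Writer: publish stats stamped with the current version. */
static void stats_write(struct versioned_stats *s, unsigned long load)
{
	s->load = load;
	s->stamp = s->version;
}

/* Reader: reuse only while the stamp still matches the version. */
static bool stats_read(const struct versioned_stats *s, unsigned long *load)
{
	if (s->stamp != s->version)
		return false;	/* invalidated: recompute */
	*load = s->load;
	return true;
}

/* Once a load balancing instance is done, stale all cached copies. */
static void stats_invalidate(struct versioned_stats *s)
{
	s->version++;
}
```

Unlike the pure jiffy stamp, this lets a finished load balancing
instance proactively invalidate the cache, which is what would make
reuse safe for frequent newidle balancing as well.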
--
Thanks and Regards,
Prateek
> thanks,
> Chenyu
>
>
>> The current implementation without invalidation of cached stats has a
>> few limitations, namely that stats reuse is limited to busy load
>> balancing since stats can only be updated once per jiffy. Newidle
>> balance can happen frequently and concurrently on many CPUs, which
>> can result in readers seeing inconsistent values for the propagated
>> stats.
>>
>> For this iteration, the focus is to reduce the time taken for busy
>> load balancing, allowing the busy CPU to resume running the task as
>> quickly as possible.
>>
>> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
>> ---
Thread overview: 24+ messages
2025-03-13 9:37 [RFC PATCH 0/8] sched/fair: Propagate load balancing stats up the sched domain hierarchy K Prateek Nayak
2025-03-13 9:37 ` [RFC PATCH 1/8] sched/topology: Assign sd_share for all non NUMA sched domains K Prateek Nayak
2025-03-13 9:37 ` [RFC PATCH 2/8] sched/topology: Introduce sg->shared K Prateek Nayak
2025-03-13 9:37 ` [RFC PATCH 3/8] sched/fair: Move "struct sg_lb_stats" and its dependencies to sched.h K Prateek Nayak
2025-03-13 9:37 ` [RFC PATCH 4/8] sched/fair: Move sg_{overloaded,overutilized} calculation to sg_lb_stats K Prateek Nayak
2025-03-13 9:37 ` [RFC PATCH 5/8] sched/topology: Define sg_lb_stats_prop and embed it inside sched_domain_shared K Prateek Nayak
2025-03-13 9:37 ` [RFC PATCH 6/8] sched/fair: Increase probability of lb stats being reused K Prateek Nayak
2025-03-17 18:07 ` Chen, Yu C
2025-03-19 6:51 ` K Prateek Nayak
2025-03-13 9:37 ` [RFC PATCH 7/8] sched/fair: Retrieve cached group stats from sg_lb_stats_prop K Prateek Nayak
2025-03-17 18:04 ` Chen, Yu C
2025-03-19 6:42 ` K Prateek Nayak [this message]
2025-03-13 9:37 ` [RFC PATCH 8/8] sched/fair: Update stats for sched_domain using the sched_group stats K Prateek Nayak
2025-03-16 10:29 ` [RFC PATCH 09/08] [ANNOTATE] sched/fair: Stats versioning and invalidation K Prateek Nayak
2025-03-16 10:29 ` [RFC PATCH 10/08] sched/fair: Compute nr_{numa,preferred}_running for non-NUMA domains K Prateek Nayak
2025-03-16 10:29 ` [RFC PATCH 11/08] sched/fair: Move from "last_update" to stats versioning K Prateek Nayak
2025-03-16 10:29 ` [RFC PATCH 12/08] sched/fair: Record the cpu that updated the stats last K Prateek Nayak
2025-03-16 10:29 ` [RFC PATCH 13/08] sched/fair: Invalidate stats once the load balancing instance is done K Prateek Nayak
2025-03-16 10:29 ` [RFC PATCH 14/08] [DEBUG] sched/fair: Add more lb_stats around lb_time and stats reuse K Prateek Nayak
2025-03-16 10:29 ` [RFC PATCH 15/08] [DEBUG] tools/lib/perf: Extend schedstats v17 headers to include the new debug fields K Prateek Nayak
2025-03-17 17:25 ` [RFC PATCH 0/8] sched/fair: Propagate load balancing stats up the sched domain hierarchy Peter Zijlstra
2025-03-17 18:23 ` Chen, Yu C
2025-03-21 10:04 ` Libo Chen
2025-03-24 3:58 ` K Prateek Nayak