From: Breno Leitao <leitao@debian.org>
To: Jeremy Linton <jeremy.linton@arm.com>
Cc: linux-pm@vger.kernel.org, sumitg@nvidia.com,
pierre.gondois@arm.com, zhenglifeng1@huawei.com,
zhanjie9@hisilicon.com, viresh.kumar@linaro.org,
rafael@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] cpufreq: cppc: Reduce cppc delivered perf sampling jitter
Date: Thu, 4 Jun 2026 01:12:03 -0700
Message-ID: <aiCI__XLprPPZ14D@gmail.com>
In-Reply-To: <f047de36-5d0a-489e-84ee-4bab0a1fd8d4@arm.com>
On Wed, Jun 03, 2026 at 11:46:51AM -0500, Jeremy Linton wrote:
> Hi,
>
> Thanks for looking at this.
>
> On 6/3/26 5:54 AM, Breno Leitao wrote:
> > On Tue, Jun 02, 2026 at 04:20:52PM -0500, Jeremy Linton wrote:
> > > CPPC uses a pair of registers cycling at different frequencies to
> > > determine an accumulated performance level. For userspace reporting we
> > > want to convert this to an instantaneous CPU frequency, but over short
> > > time periods small errors caused by CPPC counter reads can cause
> > > fairly significant reported frequency variations even when the core
> > > CPU clock isn't changing.
> > >
> > > Reduce this by keeping a start sample fixed and retrying the end
> > > sample until the counter deltas are large enough to reduce short
> > > window error, or until adjacent delivered performance estimates are
> > > within the CPU's observed CPPC read noise floor.
> > >
> > > To begin, resample the initial pair a small fixed number of times
> > > looking for matching delivered performance deltas. This reduces the
> > > chance that a disturbed start sample anchors the rest of the
> > > calculation.
> > >
> > > Then look for an end sample while updating the noise floor from the
> > > best error seen between samples. The floor remains zero on systems
> > > with stable feedback reads, but lets noisy systems stop early once
> > > another retry is unlikely to improve the result. The retry loop is
> > > capped at 200 iterations, giving an ~20 usec explicit delay budget
> > > derived from ndelay(100).
> > >
> > > Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> > > ---
> > > drivers/cpufreq/cppc_cpufreq.c | 68 ++++++++++++++++++++++++++++++----
> > > 1 file changed, 61 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
> > > index 7e7f9dfb7a24..362c08def420 100644
> > > --- a/drivers/cpufreq/cppc_cpufreq.c
> > > +++ b/drivers/cpufreq/cppc_cpufreq.c
> > > @@ -50,7 +50,7 @@ struct cppc_freq_invariance {
> > > static DEFINE_PER_CPU(struct cppc_freq_invariance, cppc_freq_inv);
> > > static struct kthread_worker *kworker_fie;
> > > -static int cppc_perf_from_fbctrs(u64 reference_perf,
> > > +static u64 cppc_perf_from_fbctrs(u64 reference_perf,
> > > struct cppc_perf_fb_ctrs *fb_ctrs_t0,
> > > struct cppc_perf_fb_ctrs *fb_ctrs_t1);
> > > @@ -750,7 +750,7 @@ static inline u64 get_delta(u64 t1, u64 t0)
> > > return (u32)t1 - (u32)t0;
> > > }
> > > -static int cppc_perf_from_fbctrs(u64 reference_perf,
> > > +static u64 cppc_perf_from_fbctrs(u64 reference_perf,
> > > struct cppc_perf_fb_ctrs *fb_ctrs_t0,
> > > struct cppc_perf_fb_ctrs *fb_ctrs_t1)
> > > {
> > > @@ -771,19 +771,71 @@ static int cppc_perf_from_fbctrs(u64 reference_perf,
> > > return (reference_perf * delta_delivered) / delta_reference;
> > > }
> > > -static int cppc_get_perf_ctrs_sample(int cpu,
> > > +/* CPPC read noise floor for early retry exit. */
> > > +static DEFINE_PER_CPU(u64, err_floor);
> > > +
> > > +#define CPPC_SAMPLE_MAX_RETRIES 200
> >
> > Could the remaining tuning literals get the same treatment?
> > Specifically:
> >
> > - the 10 initial-resample iteration count
> > - the 2000 multiplier in ref * 2000
> > - the 100 ns in ndelay(100)
>
> Sure. A few of these were personal judgment from the platforms I tried it
> on. I had some instrumentation at the bottom which was printing loop counts
> and error values, and largely I picked those values based on how they were
> behaving, or on back-of-the-envelope estimates. For example, that 200 is afaik
> overkill; it usually settles down around 20 or less, which makes this
> faster than the old method on at least one platform I tried it on. And they
> are all intended to be "upper bound" values that exit the loop because
> something isn't working right.
>
>
> I'm interested in whether this patch stabilizes the frequency reporting in
> some of the cases I've heard people talking about.
Ack. I will try to get this tested tomorrow and report back here.

Thanks,
--breno
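
As a rough illustration of the arithmetic discussed in the quoted text (not code from the posted patch), here is a minimal userspace sketch. It assumes the delivered-performance formula and the 32-bit wrap handling visible in the quoted hunks. `CPPC_SAMPLE_DELAY_NS`, `cppc_retry_budget_ns()` and `perf_from_deltas()` are hypothetical names introduced only for this example, and returning 0 on an empty delta is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define CPPC_SAMPLE_MAX_RETRIES 200 /* name and value from the quoted patch */
#define CPPC_SAMPLE_DELAY_NS    100 /* hypothetical name for the 100 in ndelay(100) */

/* 200 retries * 100 ns = 20000 ns, i.e. the ~20 usec budget in the commit message. */
static_assert(CPPC_SAMPLE_MAX_RETRIES * CPPC_SAMPLE_DELAY_NS == 20000,
	      "retry budget should be ~20 usec");

/* Worst-case explicit delay spent in the retry loop, in nanoseconds. */
static inline uint64_t cppc_retry_budget_ns(void)
{
	return (uint64_t)CPPC_SAMPLE_MAX_RETRIES * CPPC_SAMPLE_DELAY_NS;
}

/* Counter delta that tolerates a single wrap of a 32-bit counter. */
static inline uint64_t get_delta(uint64_t t1, uint64_t t0)
{
	if (t1 > t0 || t0 > UINT32_MAX)
		return t1 - t0;
	return (uint32_t)t1 - (uint32_t)t0;
}

/*
 * Delivered performance from two counter samples:
 *   reference_perf * delta_delivered / delta_reference
 * Returning 0 when either delta is zero is an assumption of this sketch.
 */
static inline uint64_t perf_from_deltas(uint64_t reference_perf,
					uint64_t ref_t0, uint64_t ref_t1,
					uint64_t del_t0, uint64_t del_t1)
{
	uint64_t delta_reference = get_delta(ref_t1, ref_t0);
	uint64_t delta_delivered = get_delta(del_t1, del_t0);

	if (!delta_reference || !delta_delivered)
		return 0;

	return (reference_perf * delta_delivered) / delta_reference;
}
```

With reference_perf = 100, a one-count read error over a short window (reference delta 10, delivered delta 16 instead of 15) shifts the result from 150 to 160. The same one-count error over a longer window (reference delta 1000, delivered delta 1501) still yields 150. That is the motivation for retrying the end sample until the deltas are large enough.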
Thread overview: 5+ messages
2026-06-02 21:20 [PATCH] cpufreq: cppc: Reduce cppc delivered perf sampling jitter Jeremy Linton
2026-06-03 10:54 ` Breno Leitao
2026-06-03 16:46 ` Jeremy Linton
2026-06-04 8:12 ` Breno Leitao [this message]
2026-06-04 9:42 ` Breno Leitao