From: Morten Rasmussen <morten.rasmussen@arm.com>
To: Sowjanya Komatineni <skomatineni@nvidia.com>
Cc: Lukasz Luba <lukasz.luba@arm.com>,
sudeep.holla@arm.com, souvik.chakravarty@arm.com,
thierry.reding@gmail.com, mark.rutland@arm.com,
lorenzo.pieralisi@arm.com, daniel.lezcano@linaro.org,
robh+dt@kernel.org, jonathanh@nvidia.com, ksitaraman@nvidia.com,
sanjayc@nvidia.com, linux-arm-kernel@lists.infradead.org,
linux-tegra@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-pm@vger.kernel.org, devicetree@vger.kernel.org
Subject: Re: [RFC PATCH 0/4] Support for passing runtime state idle time to TF-A
Date: Mon, 26 Apr 2021 15:11:50 +0200
Message-ID: <20210426131150.GA36549@e123083-lin>
In-Reply-To: <486856be-1e66-fd77-e306-949b91bcdb1d@nvidia.com>
Hi,
On Fri, Apr 23, 2021 at 03:24:51PM -0700, Sowjanya Komatineni wrote:
> On 4/23/21 1:16 PM, Lukasz Luba wrote:
> > Hi Sowjanya,
> >
> > On 4/22/21 9:30 PM, Sowjanya Komatineni wrote:
> > > Tegra194 and Tegra186 platforms use separate MCE firmware for CPUs,
> > > which is in charge of deciding on state transitions based on target
> > > state, state idle time, and some other Tegra CPU core cluster state
> > > information.
> > >
> > > The current PSCI specification doesn't define a function for passing
> > > the runtime state idle time predicted by the governor (based on next
> > > events and state target residency) to ARM trusted firmware.
> >
> > Do you have some numbers from experiments showing that these idle
> > governor prediction values, which are passed from kernel to MCE
> > firmware, are making a good 'guess'?
> > How much precision (1us? 1ms?) in the values do you need there?
>
> It could also be a few ms, depending on when the next CPU event/activity
> might happen, which is not transparent to MCE firmware.
>
> >
> > IIRC (probably Rafael's presentations) predicting in the kernel
> > something like CPU idle time residency is not a trivial thing.
> >
> > Another idea (depending on DT structure and PSCI bits):
> > Could this be solved differently, just by knowing that if
> > the governor requested some C-state, the governor 'predicted'
> > an idle residency greater than the min_residency attached to this
> > C-state?
> > Then, when that request shows up in your FW, you know that it must be at
> > least min_residency because of this C-state id.
> C6 is the only deepest state for Tegra194 Carmel CPU that we support in
> addition to C1 (WFI) idle state.
>
> MCE firmware gets state crossover thresholds for C1 to C6 transition from
> TF-A and uses it along with state idle time to decide on C6 state entry
> based on its background work.
>
> Assuming for now that we use min_residency, which is a static value from
> DT, as the state idle time, then it always enters the deepest state C6, as
> the min_residency value we use is always higher than the state crossover
> threshold.
>
> But MCE firmware is not aware of when next cpu event can happen to predict
> if next event can take longer than state min_residency time.
>
> Using min residency in such case is very conservative where MCE firmware
> exits C6 state early where we may not have better power saving.
>
> But with MCE firmware being aware of when next event can happen it can use
> that to stay in C6 state without early exit for better power savings.
>
> > It would depend on number of available states, max_residency, scale
> > that you would choose while assigning values from [0, max_residency]
> > to each state.
> > IIRC there can be many state IDs for idle, so it would depend on
> > number of bits encoding this state, and your needs. Example of
> > linear scale:
> > 4-bits encoding idle state and max predicted residency 10msec,
> > that means 10000us / 16 states = 625us/state.
> > The max_residency might be split differently, using a non-linear
> > function, to make some range more precise.
> >
> > Open question is if these idle states must be all represented
> > in DT, or there is a way of describing a 'set of idle states'
> > automatically.
> We only support C6 state through DT as C6 is the only deepest state for
> Tegra194 carmel CPU. WFI idle state is completely handled by kernel and does
> not require MCE sequences for entry/exit.
I think Lukasz's point is that you can encode the predicted idle time by
having multiple idle_state entries with different min_residency mapping
to the same actual idle-state. So you would have several variants of C6
with different min_residencies, and if the OS picks one with a longer
min_residency, firmware would have a better estimate of the predicted
idle residency.
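
To make the encoding idea concrete, here is a minimal sketch using the
linear-scale numbers quoted above (4 bits, 10 ms maximum predicted
residency). All function names are hypothetical, and the selection logic
is deliberately simplified: a real cpuidle governor also weighs exit
latency and other factors, so treat this only as an illustration of how
a variant index could carry a lower bound on the predicted idle time.

```python
from bisect import bisect_right


def make_variants(max_residency_us, bits):
    """Return min_residency (us) for each variant of one idle state.

    Hypothetical helper: splits [0, max_residency_us] linearly into
    2**bits buckets, as in the quoted 10000us / 16 = 625us example.
    """
    count = 1 << bits
    step = max_residency_us // count
    return [i * step for i in range(count)]


def pick_variant(variants, predicted_idle_us):
    """OS side: pick the variant with the largest min_residency that
    does not exceed the predicted idle time (simplified assumption)."""
    return max(bisect_right(variants, predicted_idle_us) - 1, 0)


def firmware_lower_bound(variants, index):
    """Firmware side: the variant index alone implies that the predicted
    idle time is at least this many microseconds."""
    return variants[index]


variants = make_variants(10000, 4)
index = pick_variant(variants, 3000)
print(index, firmware_lower_bound(variants, index))
```

With a 3000 us prediction the OS would request variant 4, from which
firmware can infer an idle time of at least 2500 us. The precision is
limited by the bucket width, which is the cost of this approach compared
to passing the predicted value directly.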
I'm not convinced it is the right way to work around passing this
information on to firmware. I would rather see an example of how well
this works (best with numbers) and have a proper solution.
Morten