From: Lukasz Luba <lukasz.luba@arm.com>
To: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
dietmar.eggemann@arm.com, amit.kucheria@verdurent.com,
amit.kachhap@gmail.com, daniel.lezcano@linaro.org,
viresh.kumar@linaro.org, pavel@ucw.cz, mhiramat@kernel.org,
qyousef@layalina.io, wvw@google.com,
Morten Rasmussen <morten.rasmussen@arm.com>,
Ionela Voinescu <ionela.voinescu@arm.com>,
Beata Michalska <Beata.Michalska@arm.com>,
Sumit Gupta <sumitg@nvidia.com>
Subject: Re: [PATCH v5 00/23] Introduce runtime modifiable Energy Model
Date: Wed, 13 Dec 2023 09:32:14 +0000 [thread overview]
Message-ID: <0c3f04cd-ea37-4308-b3f4-511562fa539e@arm.com> (raw)
In-Reply-To: <CAJZ5v0gqAGv50MK9oAVW=+w+7unavQ86cHF6zmyJjmVvNYjzUg@mail.gmail.com>
Hi Rafael,
Thank you for having a loot at the series.
On 12/12/23 18:49, Rafael J. Wysocki wrote:
> Hi Lukasz,
>
> On Wed, Nov 29, 2023 at 12:08 PM Lukasz Luba <lukasz.luba@arm.com> wrote:
>>
>> Hi all,
>>
>> This patch set adds a new feature which allows to modify Energy Model (EM)
>> power values at runtime. It will allow to better reflect power model of
>> a recent SoCs and silicon. Different characteristics of the power usage
>> can be leveraged and thus better decisions made during task placement in EAS.
>>
>> It's part of feature set know as Dynamic Energy Model. It has been presented
>> and discussed recently at OSPM2023 [3]. This patch set implements the 1st
>> improvement for the EM.
>>
>> The concepts:
>> 1. The CPU power usage can vary due to the workload that it's running or due
>> to the temperature of the SoC. The same workload can use more power when the
>> temperature of the silicon has increased (e.g. due to hot GPU or ISP).
>> In such situation the EM can be adjusted and reflect the fact of increased
>> power usage. That power increase is due to static power
>> (sometimes called simply: leakage). The CPUs in recent SoCs are different.
>> We have heterogeneous SoCs with 3 (or even 4) different microarchitectures.
>> They are also built differently with High Performance (HP) cells or
>> Low Power (LP) cells. They are affected by the temperature increase
>> differently: HP cells have bigger leakage. The SW model can leverage that
>> knowledge.
>>
>> 2. It is also possible to change the EM to better reflect the currently
>> running workload. Usually the EM is derived from some average power values
>> taken from experiments with benchmark (e.g. Dhrystone). The model derived
>> from such scenario might not represent properly the workloads usually running
>> on the device. Therefore, runtime modification of the EM allows to switch to
>> a different model, when there is a need.
>>
>> 3. The EM can be adjusted after boot, when all the modules are loaded and
>> more information about the SoC is available e.g. chip binning. This would help
>> to better reflect the silicon characteristics. Thus, this EM modification
>> API allows it now. It wasn't possible in the past and the EM had to be
>> 'set in stone'.
>>
>> More detailed explanation and background can be found in presentations
>> during LPC2022 [1][2] or in the documentation patches.
>>
>> Some test results.
>> The EM can be updated to fit better the workload type. In the case below the EM
>> has been updated for the Jankbench test on Pixel6 (running v5.18 w/ mainline backports
>> for the scheduler bits). The Jankbench was run 10 times for those two configurations,
>> to get more reliable data.
>>
>> 1. Janky frames percentage
>> +--------+-----------------+---------------------+-------+-----------+
>> | metric | variable | kernel | value | perc_diff |
>> +--------+-----------------+---------------------+-------+-----------+
>> | gmean | jank_percentage | EM_default | 2.0 | 0.0% |
>> | gmean | jank_percentage | EM_modified_runtime | 1.3 | -35.33% |
>> +--------+-----------------+---------------------+-------+-----------+
>>
>> 2. Avg frame render time duration
>> +--------+---------------------+---------------------+-------+-----------+
>> | metric | variable | kernel | value | perc_diff |
>> +--------+---------------------+---------------------+-------+-----------+
>> | gmean | mean_frame_duration | EM_default | 10.5 | 0.0% |
>> | gmean | mean_frame_duration | EM_modified_runtime | 9.6 | -8.52% |
>> +--------+---------------------+---------------------+-------+-----------+
>>
>> 3. Max frame render time duration
>> +--------+--------------------+---------------------+-------+-----------+
>> | metric | variable | kernel | value | perc_diff |
>> +--------+--------------------+---------------------+-------+-----------+
>> | gmean | max_frame_duration | EM_default | 251.6 | 0.0% |
>> | gmean | max_frame_duration | EM_modified_runtime | 115.5 | -54.09% |
>> +--------+--------------------+---------------------+-------+-----------+
>>
>> 4. OS overutilized state percentage (when EAS is not working)
>> +--------------+---------------------+------+------------+------------+
>> | metric | wa_path | time | total_time | percentage |
>> +--------------+---------------------+------+------------+------------+
>> | overutilized | EM_default | 1.65 | 253.38 | 0.65 |
>> | overutilized | EM_modified_runtime | 1.4 | 277.5 | 0.51 |
>> +--------------+---------------------+------+------------+------------+
>>
>> 5. All CPUs (Little+Mid+Big) power values in mW
>> +------------+--------+---------------------+-------+-----------+
>> | channel | metric | kernel | value | perc_diff |
>> +------------+--------+---------------------+-------+-----------+
>> | CPU | gmean | EM_default | 142.1 | 0.0% |
>> | CPU | gmean | EM_modified_runtime | 131.8 | -7.27% |
>> +------------+--------+---------------------+-------+-----------+
>>
>> The time cost to update the EM decreased in this v5 vs v4:
>> big: 5us vs 2us -> 2.6x faster
>> mid: 9us vs 3us -> 3x faster
>> little: 16us vs 16us -> no change
>>
>> We still have to update the inefficiency in the cpufreq framework, thus
>> a bit of overhead will be there.
>>
>> Changelog:
>> v5:
>> - removed 2 tables design
>> - have only one table (runtime_table) used also in thermal (Wei, Rafael)
>> - refactored update function and removed callback call for each opp
>> - added faster EM table swap, using only the RCU pointer update
>> - added memory allocation API and tracking with kref
>> - avoid overhead for computing 'cost' for each OPP in update, it can be
>> pre-computed in device drivers EM earlier
>> - add support for device drivers providing EM table
>> - added API for computing 'cost' values in EM for EAS
>> - added API for thermal/powercap to use EM (using RCU wrappers)
>> - switched to single allocation and 'state[]' array (Rafael)
>> - changed documentation to align with current design
>> - added helper API for computing cost values
>> - simplified EM free in unregister path (thanks to kref)
>> - split patch updating EM clients and changed them separetly
>> - added seperate patch removing old static EM table
>> - added EM debugfs change patch to dump the runtime_table
>> - addressed comments in v4 for spelling/comments/headers
>> - added review tags
>
> I like this one more than the previous one and thanks for taking my
> feedback into account.
>
> I would still like other people having a vested interest in the EM to
> look at it and give feedback (or just tags), so I'm not inclined to
> apply it just yet. However, I don't have any specific comments on it.
Let me contact offline some of the partners who were keen to have this
in mainline (when I presented some first implementation in 2021 at
Android kernel review systems).
Regards,
Lukasz
next prev parent reply other threads:[~2023-12-13 9:31 UTC|newest]
Thread overview: 99+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-29 11:08 [PATCH v5 00/23] Introduce runtime modifiable Energy Model Lukasz Luba
2023-11-29 11:08 ` [PATCH v5 01/23] PM: EM: Add missing newline for the message log Lukasz Luba
2023-11-29 11:08 ` [PATCH v5 02/23] PM: EM: Refactor em_cpufreq_update_efficiencies() arguments Lukasz Luba
2023-12-17 17:58 ` Qais Yousef
2023-12-19 10:30 ` Lukasz Luba
2023-12-28 16:59 ` Qais Yousef
2024-01-02 9:40 ` Lukasz Luba
2023-11-29 11:08 ` [PATCH v5 03/23] PM: EM: Find first CPU active while updating OPP efficiency Lukasz Luba
2023-12-17 17:58 ` Qais Yousef
2023-12-19 10:53 ` Lukasz Luba
2023-12-28 17:13 ` Qais Yousef
2024-01-02 9:42 ` Lukasz Luba
2023-11-29 11:08 ` [PATCH v5 04/23] PM: EM: Refactor em_pd_get_efficient_state() to be more flexible Lukasz Luba
2023-12-12 18:49 ` Dietmar Eggemann
2023-12-19 10:58 ` Lukasz Luba
2023-11-29 11:08 ` [PATCH v5 05/23] PM: EM: Refactor a new function em_compute_costs() Lukasz Luba
2023-12-17 17:58 ` Qais Yousef
2023-12-19 10:59 ` Lukasz Luba
2023-12-28 17:14 ` Qais Yousef
2024-01-02 9:43 ` Lukasz Luba
2023-11-29 11:08 ` [PATCH v5 06/23] PM: EM: Check if the get_cost() callback is present in em_compute_costs() Lukasz Luba
2023-11-29 11:08 ` [PATCH v5 07/23] PM: EM: Refactor how the EM table is allocated and populated Lukasz Luba
2023-12-12 18:50 ` Dietmar Eggemann
2023-12-19 13:19 ` Lukasz Luba
2023-12-17 17:59 ` Qais Yousef
2023-11-29 11:08 ` [PATCH v5 08/23] PM: EM: Introduce runtime modifiable table Lukasz Luba
2023-12-12 18:50 ` Dietmar Eggemann
2023-12-19 11:33 ` Lukasz Luba
2023-11-29 11:08 ` [PATCH v5 09/23] PM: EM: Use runtime modified EM for CPUs energy estimation in EAS Lukasz Luba
2023-12-17 17:59 ` Qais Yousef
2023-12-19 4:03 ` Xuewen Yan
2023-12-19 8:32 ` Lukasz Luba
2023-12-28 17:32 ` Qais Yousef
2024-01-02 11:17 ` Lukasz Luba
2023-11-29 11:08 ` [PATCH v5 10/23] PM: EM: Add API for memory allocations for new tables Lukasz Luba
2023-12-17 17:59 ` Qais Yousef
2023-12-19 8:45 ` Lukasz Luba
2023-11-29 11:08 ` [PATCH v5 11/23] PM: EM: Add API for updating the runtime modifiable EM Lukasz Luba
2023-12-12 18:50 ` Dietmar Eggemann
2023-12-20 8:06 ` Lukasz Luba
2024-01-04 15:45 ` Dietmar Eggemann
2024-01-04 16:55 ` Lukasz Luba
2023-11-29 11:08 ` [PATCH v5 12/23] PM: EM: Add helpers to read under RCU lock the EM table Lukasz Luba
2023-11-29 11:08 ` [PATCH v5 13/23] PM: EM: Add performance field to struct em_perf_state Lukasz Luba
2023-12-17 18:00 ` Qais Yousef
2023-12-20 8:21 ` Lukasz Luba
2023-12-28 17:45 ` Qais Yousef
2023-11-29 11:08 ` [PATCH v5 14/23] PM: EM: Support late CPUs booting and capacity adjustment Lukasz Luba
2023-12-12 18:50 ` Dietmar Eggemann
2023-12-20 8:23 ` Lukasz Luba
2023-12-17 18:00 ` Qais Yousef
2024-01-02 11:39 ` Lukasz Luba
2023-11-29 11:08 ` [PATCH v5 15/23] PM: EM: Optimize em_cpu_energy() and remove division Lukasz Luba
2023-12-12 18:50 ` Dietmar Eggemann
2023-12-20 8:42 ` Lukasz Luba
2024-01-04 16:30 ` Dietmar Eggemann
2024-01-04 16:56 ` Lukasz Luba
2023-12-28 18:06 ` Qais Yousef
2024-01-02 11:47 ` Lukasz Luba
2024-01-04 19:23 ` Qais Yousef
2024-01-10 13:53 ` Lukasz Luba
2024-01-15 12:21 ` Qais Yousef
2024-01-15 12:36 ` Lukasz Luba
2024-01-16 13:10 ` Qais Yousef
2024-01-16 15:34 ` Lukasz Luba
2024-01-16 19:33 ` Qais Yousef
2023-11-29 11:08 ` [PATCH v5 16/23] powercap/dtpm_cpu: Use new Energy Model interface to get table Lukasz Luba
2023-11-29 11:08 ` [PATCH v5 17/23] powercap/dtpm_devfreq: " Lukasz Luba
2023-11-29 11:08 ` [PATCH v5 18/23] drivers/thermal/cpufreq_cooling: Use new Energy Model interface Lukasz Luba
2023-11-29 11:08 ` [PATCH v5 19/23] drivers/thermal/devfreq_cooling: " Lukasz Luba
2023-11-29 11:08 ` [PATCH v5 20/23] PM: EM: Change debugfs configuration to use runtime EM table data Lukasz Luba
2023-11-29 11:08 ` [PATCH v5 21/23] PM: EM: Remove old table Lukasz Luba
2023-11-29 11:08 ` [PATCH v5 22/23] PM: EM: Add em_dev_compute_costs() as API for device drivers Lukasz Luba
2023-12-12 18:50 ` Dietmar Eggemann
2023-12-17 18:03 ` Qais Yousef
2023-12-18 11:56 ` Lukasz Luba
2023-12-20 11:14 ` Lukasz Luba
2023-11-29 11:08 ` [PATCH v5 23/23] Documentation: EM: Update with runtime modification design Lukasz Luba
2023-12-12 18:51 ` Dietmar Eggemann
2023-12-19 9:35 ` Lukasz Luba
2023-12-19 4:42 ` Xuewen Yan
2023-12-19 8:47 ` Lukasz Luba
2023-12-19 6:22 ` Xuewen Yan
2023-12-19 9:32 ` Lukasz Luba
2023-12-20 2:08 ` Xuewen Yan
2023-12-20 7:57 ` Lukasz Luba
2023-12-12 18:48 ` [PATCH v5 00/23] Introduce runtime modifiable Energy Model Dietmar Eggemann
2023-12-13 9:23 ` Lukasz Luba
2023-12-13 11:34 ` Dietmar Eggemann
2023-12-13 11:45 ` Rafael J. Wysocki
2023-12-13 12:20 ` Lukasz Luba
2023-12-12 18:49 ` Rafael J. Wysocki
2023-12-13 9:32 ` Lukasz Luba [this message]
2023-12-13 13:40 ` Hongyan Xia
2023-12-13 13:16 ` Lukasz Luba
2023-12-17 18:22 ` Qais Yousef
2023-12-19 10:22 ` Lukasz Luba
2023-12-28 18:41 ` Qais Yousef
2024-01-02 12:12 ` Lukasz Luba
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=0c3f04cd-ea37-4308-b3f4-511562fa539e@arm.com \
--to=lukasz.luba@arm.com \
--cc=Beata.Michalska@arm.com \
--cc=amit.kachhap@gmail.com \
--cc=amit.kucheria@verdurent.com \
--cc=daniel.lezcano@linaro.org \
--cc=dietmar.eggemann@arm.com \
--cc=ionela.voinescu@arm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pm@vger.kernel.org \
--cc=mhiramat@kernel.org \
--cc=morten.rasmussen@arm.com \
--cc=pavel@ucw.cz \
--cc=qyousef@layalina.io \
--cc=rafael@kernel.org \
--cc=sumitg@nvidia.com \
--cc=viresh.kumar@linaro.org \
--cc=wvw@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox