From: Lukasz Luba
To: Dietmar Eggemann
Cc: rui.zhang@intel.com, amit.kucheria@verdurent.com, rafael@kernel.org, linux-kernel@vger.kernel.org, amit.kachhap@gmail.com, daniel.lezcano@linaro.org, viresh.kumar@linaro.org, len.brown@intel.com, pavel@ucw.cz, mhiramat@kernel.org, qyousef@layalina.io, wvw@google.com, linux-pm@vger.kernel.org
Date: Thu, 4 Jan 2024 16:56:30 +0000
Subject: Re: [PATCH v5 15/23] PM: EM: Optimize em_cpu_energy() and remove division
Message-ID: <22c8d702-dc11-4e25-bb2d-0d29b0481991@arm.com>
In-Reply-To: <52655f7d-4056-42eb-a3c4-1eb8e21ea259@arm.com>
References: <20231129110853.94344-1-lukasz.luba@arm.com> <20231129110853.94344-16-lukasz.luba@arm.com> <52655f7d-4056-42eb-a3c4-1eb8e21ea259@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 1/4/24 16:30, Dietmar Eggemann wrote:
> On 20/12/2023 09:42, Lukasz Luba wrote:
>>
>>
>> On 12/12/23 18:50, Dietmar Eggemann wrote:
>>> On 29/11/2023 12:08, Lukasz Luba wrote:
>
> [...]
>
>>>> With this optimization, the em_cpu_energy() should run faster on the Big
>>>> CPU by 1.43x and on the Little CPU by 1.69x.
>>>
>>> Where are those precise numbers coming from? Which platform was it?
>>
>> That was mainline big.Little board rockpi4 b w/ rockchip 3399, present
>
> IMHO, you should mention the platform here so people don't wonder.
>
>> in quite a few commercial devices (e.g. chromebooks or plenty other seen in
>> DT). The numbers are from measuring the time it takes to run this
>> function em_cpu_cost() in a loop millions of times. Thus, the instruction
>> cache and data cache should be hot, but the operation would impact the
>> different score.
>
> [...]
>
>>> Can you not keep the existing comment and only change:
>>>
>>> (a) that ps->cap is ps->performance in (2) and
>>>
>>> (b) that:
>>>
>>>            *             ps->power * cpu_max_freq   cpu_util
>>>            *   cpu_nrg = ------------------------ * ---------     (3)
>>>            *                    ps->freq            scale_cpu
>>>
>>>                          <---- (old) ps->cost --->
>>>
>>>      is now
>>>
>>>                  ps->power * cpu_max_freq       1
>>>      ps->cost  = ------------------------ * ----------
>>>                          ps->freq            scale_cpu
>>>
>>>                  <---- (old) ps->cost --->
>>>
>>> and (c) that (4) has changed to:
>>>
>>>           *   pd_nrg = ps->cost * \Sum cpu_util                   (4)
>>>
>>> which avoids the division?
>>>
>>> Fewer changes are always much nicer since they make it so much easier to
>>> detect history and review changes.
>>
>> I'm open to change that, but I will have to contact you offline
>> what you mean. This comment section in code is really tricky to
>> handle right.
>
> OK, the changes you showed me offline LGTM.
>
> [...]
>

All good then. Thank you for the comments. I'll send v6.
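
For readers following the (3) -> (4) discussion quoted above, here is a minimal
sketch of the idea in plain C. It is not the code from the patch: struct
perf_state and the old_*/new_* helpers are hypothetical names made up for
illustration, and the units and the plain integer arithmetic are assumptions.
It only shows that folding 1/scale_cpu into the precomputed cost leaves a
single multiplication in the per-estimate path.

```c
#include <stdint.h>

/* Hypothetical stand-in for a performance state; not the kernel's struct. */
struct perf_state {
	uint64_t freq;	/* frequency of this state (assumed kHz) */
	uint64_t power;	/* power at this state (assumed uW) */
};

/* Old scheme: the cost leaves scale_cpu out, as in formula (3). */
static uint64_t old_cost(const struct perf_state *ps, uint64_t max_freq)
{
	return ps->power * max_freq / ps->freq;
}

/* Old scheme: every energy estimate pays for a division by scale_cpu. */
static uint64_t old_pd_energy(uint64_t cost, uint64_t sum_util,
			      uint64_t scale_cpu)
{
	return cost * sum_util / scale_cpu;
}

/* New scheme: 1/scale_cpu is folded into the cost once, at setup time. */
static uint64_t new_cost(const struct perf_state *ps, uint64_t max_freq,
			 uint64_t scale_cpu)
{
	return ps->power * max_freq / (ps->freq * scale_cpu);
}

/* New scheme: formula (4), a multiplication only. */
static uint64_t new_pd_energy(uint64_t cost, uint64_t sum_util)
{
	return cost * sum_util;
}
```

With freq = 1000000, max_freq = 2000000, power = 512000, scale_cpu = 1024 and
a summed utilization of 300, both schemes give 300000. These inputs were
picked so that every division is exact; with other inputs the integer
truncation happens at a different point, so the two results can differ in the
low bits unless the precomputed cost carries extra precision.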