public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Vincent Donnefort <vincent.donnefort@arm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	Quentin Perret <qperret@google.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH AUTOSEL 5.12 021/134] sched/fair: Fix task utilization accountability in compute_energy()
Date: Mon,  3 May 2021 12:33:20 -0400	[thread overview]
Message-ID: <20210503163513.2851510-21-sashal@kernel.org> (raw)
In-Reply-To: <20210503163513.2851510-1-sashal@kernel.org>

From: Vincent Donnefort <vincent.donnefort@arm.com>

[ Upstream commit 0372e1cf70c28de6babcba38ef97b6ae3400b101 ]

find_energy_efficient_cpu() (feec()) computes for each perf_domain (pd) an
energy delta as follows:

  feec(task)
    for_each_pd
      base_energy = compute_energy(task, -1, pd)
        -> for_each_cpu(pd)
           -> cpu_util_next(cpu, task, -1)

      energy_delta = compute_energy(task, dst_cpu, pd)
        -> for_each_cpu(pd)
           -> cpu_util_next(cpu, task, dst_cpu)
      energy_delta -= base_energy

Then it picks the best CPU as being the one that minimizes energy_delta.

cpu_util_next() estimates the CPU utilization that would happen if the
task was placed on dst_cpu as follows:

  max(cpu_util + task_util, cpu_util_est + _task_util_est)

The task contribution to the energy delta can then be either:

  (1) _task_util_est, on a mostly idle CPU, where cpu_util is close to 0
      and _task_util_est > cpu_util.
  (2) task_util, on a mostly busy CPU, where cpu_util > _task_util_est.

  (cpu_util_est doesn't appear here. It is 0 when a CPU is idle and
   otherwise must be small enough so that feec() takes the CPU as a
   potential target for the task placement)

This is problematic for feec(), as cpu_util_next() might give an unfair
advantage to a CPU which is mostly busy (2) compared to one which is
mostly idle (1). _task_util_est being always bigger than task_util in
feec() (as the task is waking up), the task contribution to the energy
might look smaller on certain CPUs (2) and this breaks the energy
comparison.

This issue is, moreover, not sporadic. By starving idle CPUs, it keeps
their cpu_util < _task_util_est (1) while others will maintain cpu_util >
_task_util_est (2).

Fix this problem by always using max(task_util, _task_util_est) as a task
contribution to the energy (ENERGY_UTIL). The new estimated CPU
utilization for the energy would then be:

  max(cpu_util, cpu_util_est) + max(task_util, _task_util_est)

compute_energy() still needs to know which OPP would be selected if the
task would be migrated in the perf_domain (FREQUENCY_UTIL). Hence,
cpu_util_next() is still used to estimate the maximum util within the pd.

Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Quentin Perret <qperret@google.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lkml.kernel.org/r/20210225083612.1113823-2-vincent.donnefort@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 kernel/sched/fair.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 794c2cb945f8..e3c2dcb1b015 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6518,8 +6518,24 @@ compute_energy(struct task_struct *p, int dst_cpu, struct perf_domain *pd)
 	 * its pd list and will not be accounted by compute_energy().
 	 */
 	for_each_cpu_and(cpu, pd_mask, cpu_online_mask) {
-		unsigned long cpu_util, util_cfs = cpu_util_next(cpu, p, dst_cpu);
-		struct task_struct *tsk = cpu == dst_cpu ? p : NULL;
+		unsigned long util_freq = cpu_util_next(cpu, p, dst_cpu);
+		unsigned long cpu_util, util_running = util_freq;
+		struct task_struct *tsk = NULL;
+
+		/*
+		 * When @p is placed on @cpu:
+		 *
+		 * util_running = max(cpu_util, cpu_util_est) +
+		 *		  max(task_util, _task_util_est)
+		 *
+		 * while cpu_util_next is: max(cpu_util + task_util,
+		 *			       cpu_util_est + _task_util_est)
+		 */
+		if (cpu == dst_cpu) {
+			tsk = p;
+			util_running =
+				cpu_util_next(cpu, p, -1) + task_util_est(p);
+		}
 
 		/*
 		 * Busy time computation: utilization clamping is not
@@ -6527,7 +6543,7 @@ compute_energy(struct task_struct *p, int dst_cpu, struct perf_domain *pd)
 		 * is already enough to scale the EM reported power
 		 * consumption at the (eventually clamped) cpu_capacity.
 		 */
-		sum_util += effective_cpu_util(cpu, util_cfs, cpu_cap,
+		sum_util += effective_cpu_util(cpu, util_running, cpu_cap,
 					       ENERGY_UTIL, NULL);
 
 		/*
@@ -6537,7 +6553,7 @@ compute_energy(struct task_struct *p, int dst_cpu, struct perf_domain *pd)
 		 * NOTE: in case RT tasks are running, by default the
 		 * FREQUENCY_UTIL's utilization can be max OPP.
 		 */
-		cpu_util = effective_cpu_util(cpu, util_cfs, cpu_cap,
+		cpu_util = effective_cpu_util(cpu, util_freq, cpu_cap,
 					      FREQUENCY_UTIL, tsk);
 		max_util = max(max_util, cpu_util);
 	}
-- 
2.30.2


  parent reply	other threads:[~2021-05-03 16:36 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-03 16:33 [PATCH AUTOSEL 5.12 001/134] drm: Added orientation quirk for OneGX1 Pro Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 002/134] drm/qxl: do not run release if qxl failed to init Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 003/134] drm/qxl: release shadow on shutdown Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 004/134] drm/ast: Fix invalid usage of AST_MAX_HWC_WIDTH in cursor atomic_check Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 005/134] drm/amd/display: changing sr exit latency Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 006/134] drm/amd/display: Fix MPC OGAM power on/off sequence Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 007/134] drm/amd/pm: do not issue message while write "r" into pp_od_clk_voltage Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 008/134] drm/ast: fix memory leak when unload the driver Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 009/134] drm/amd/display: Check for DSC support instead of ASIC revision Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 010/134] drm/amd/display: Don't optimize bandwidth before disabling planes Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 011/134] drm/amd/display: Return invalid state if GPINT times out Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 012/134] drm/amdgpu/display: buffer INTERRUPT_LOW_IRQ_CONTEXT interrupt work Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 013/134] drm/amd/display/dc/dce/dce_aux: Remove duplicate line causing 'field overwritten' issue Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 014/134] scsi: lpfc: Fix incorrect dbde assignment when building target abts wqe Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 015/134] scsi: lpfc: Fix pt2pt connection does not recover after LOGO Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 016/134] scsi: lpfc: Fix status returned in lpfc_els_retry() error exit path Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 017/134] scsi: lpfc: Fix PLOGI ACC to be transmit after REG_LOGIN Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 018/134] scsi: lpfc: Fix ADISC handling that never frees nodes Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 019/134] drm/amd/pm/swsmu: clean up user profile function Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 020/134] drm/amdgpu: Fix some unload driver issues Sasha Levin
2021-05-03 16:33 ` Sasha Levin [this message]
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 022/134] sched/pelt: Fix task util_est update filtering Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 023/134] sched/topology: fix the issue groups don't span domain->span for NUMA diameter > 2 Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 024/134] kvfree_rcu: Use same set of GFP flags as does single-argument Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 025/134] drm/virtio: fix possible leak/unlock virtio_gpu_object_array Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 026/134] scsi: target: pscsi: Fix warning in pscsi_complete_cmd() Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 027/134] media: ite-cir: check for receive overflow Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 028/134] media: drivers: media: pci: sta2x11: fix Kconfig dependency on GPIOLIB Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 029/134] media: drivers/media/usb: fix memory leak in zr364xx_probe Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 030/134] media: cx23885: add more quirks for reset DMA on some AMD IOMMU Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 031/134] media: imx: capture: Return -EPIPE from __capture_legacy_try_fmt() Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 032/134] atomisp: don't let it go past pipes array Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 033/134] power: supply: bq27xxx: fix power_avg for newer ICs Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 034/134] extcon: arizona: Fix some issues when HPDET IRQ fires after the jack has been unplugged Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 035/134] extcon: arizona: Fix various races on driver unbind Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 036/134] media: venus: core, venc, vdec: Fix probe dependency error Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 037/134] s390/qdio: let driver manage the QAOB Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 038/134] media: media/saa7164: fix saa7164_encoder_register() memory leak bugs Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 039/134] media: gspca/sq905.c: fix uninitialized variable Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 040/134] media: v4l2-ctrls.c: initialize flags field of p_fwht_params Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 041/134] media: pci: saa7164: Rudimentary spelling fixes in the file saa7164-types.h Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 042/134] power: supply: Use IRQF_ONESHOT Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 043/134] backlight: qcom-wled: Use sink_addr for sync toggle Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 044/134] backlight: qcom-wled: Fix FSC update issue for WLED5 Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 045/134] drm/bridge/analogix/anx78xx: Setup encoder before registering connector Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 046/134] drm/bridge/analogix/anx78xx: Cleanup on error in anx78xx_bridge_attach() Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 047/134] drm/amdgpu: enable retry fault wptr overflow Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 048/134] drm/amdgpu: enable 48-bit IH timestamp counter Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 049/134] drm/amdgpu: mask the xgmi number of hops reported from psp to kfd Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 050/134] drm/amdkfd: Fix UBSAN shift-out-of-bounds warning Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 051/134] drm/amd/display: Align cursor cache address to 2KB Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 052/134] drm/amdgpu : Fix asic reset regression issue introduce by 8f211fe8ac7c4f Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 053/134] drm/amd/pm: fix workload mismatch on vega10 Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 054/134] drm/amd/display: Fix UBSAN warning for not a valid value for type '_Bool' Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 055/134] drm/amd/display: DCHUB underflow counter increasing in some scenarios Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 056/134] drm/amd/display: fix dml prefetch validation Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 057/134] drm/amd/display: Fix potential memory leak Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 058/134] drm/amdgpu: Fix " Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 059/134] scsi: qla2xxx: Always check the return value of qla24xx_get_isp_stats() Sasha Levin
2021-05-03 16:33 ` [PATCH AUTOSEL 5.12 060/134] drm/vkms: fix misuse of WARN_ON Sasha Levin
2021-05-03 16:34 ` [PATCH AUTOSEL 5.12 061/134] block, bfq: fix weight-raising resume with !low_latency Sasha Levin
2021-05-03 16:34 ` [PATCH AUTOSEL 5.12 062/134] scsi: qla2xxx: Fix use after free in bsg Sasha Levin
2021-05-03 16:34 ` [PATCH AUTOSEL 5.12 063/134] mmc: sdhci-esdhc-imx: validate pinctrl before use it Sasha Levin
2021-05-03 16:34 ` [PATCH AUTOSEL 5.12 064/134] mmc: sdhci-pci: Add PCI IDs for Intel LKF Sasha Levin
2021-05-03 16:34 ` [PATCH AUTOSEL 5.12 065/134] mmc: sdhci-brcmstb: Remove CQE quirk Sasha Levin
2021-05-03 16:34 ` [PATCH AUTOSEL 5.12 066/134] ata: ahci: Disable SXS for Hisilicon Kunpeng920 Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210503163513.2851510-21-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=dietmar.eggemann@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=qperret@google.com \
    --cc=stable@vger.kernel.org \
    --cc=vincent.donnefort@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox