From: Yuyang Du
Subject: [RFC] A new CPU load metric for power-efficient scheduler: CPU ConCurrency
Date: Fri, 25 Apr 2014 03:30:05 +0800
Message-ID: <20140424193004.GA2467@intel.com>
List-Id: linux-pm@vger.kernel.org
To: mingo@redhat.com, peterz@infradead.org, linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org
Cc: arjan.van.de.ven@intel.com, len.brown@intel.com, rafael.j.wysocki@intel.com, alan.cox@intel.com, mark.gross@intel.com, morten.rasmussen@arm.com, vincent.guittot@linaro.org, yuyang.du@intel.com

Hi Ingo, PeterZ, and others,

The current scheduler’s load balancing is completely work-conserving.
In some workloads, where CPU utilization is generally low but
interspersed with CPU bursts from transient tasks, migrating tasks to
engage all available CPUs for the sake of work conservation can lead to
significant overhead: loss of cache locality, idle/active HW state
transition latency and power, shallower idle states, etc. These costs
hurt both power and performance, especially on today’s low-power mobile
processors.

This RFC introduces a sense of idleness-conserving into work-conserving
(by all means, we really don’t want to be overwhelming in only one way).
But to what extent should idleness be conserved, bearing in mind that we
don’t want to sacrifice performance? We first need a load/idleness
indicator to that end.

Thanks to CFS’s “model an ideal, precise multi-tasking CPU”, tasks can
be seen as concurrently running (the tasks in the runqueue).
So it is natural to use task concurrency as the load indicator. Having
said that, we do two things:

1) Divide continuous time into periods, and average the task concurrency
within each period, to tolerate transient bursts:

a = sum(concurrency * time) / period

2) Exponentially decay past periods and sum them all, for hysteresis to
load drops or resilience to load rises (let f be the decay factor, and
a_x the xth period average since period 0):

s = a_n + f^1 * a_(n-1) + f^2 * a_(n-2) + ... + f^(n-1) * a_1 + f^n * a_0

We name this load indicator CPU ConCurrency (CC): task concurrency
determines how many CPUs need to be running concurrently.

To track CC, we intercept the scheduler at 1) enqueue, 2) dequeue,
3) scheduler tick, and 4) idle enter/exit.

Using CC, we implemented a Workload Consolidation patch on two Intel
mobile platforms (a quad-core composed of two dual-core modules): load
and load balancing are contained in the first dual-core module when the
aggregated CC is low, and otherwise use the full quad-core. Results show
power savings and no substantial performance regression (even gains for
some).

Thanks,
Yuyang
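
For illustration, the two steps above (the per-period average and the
decayed sum) can be sketched in a few lines of Python. This is not the
patch itself: the function names, the (concurrency, duration) sample
format, the trace, and the value f = 0.5 are all assumptions made only
for this example. The incremental form s_n = a_n + f * s_(n-1) is
algebraically the same as the sum in 2).

```python
def period_average(samples, period):
    """Time-weighted average concurrency over one period.

    samples: list of (concurrency, duration) pairs; any time in the
    period not covered by a sample counts as idle (concurrency 0).
    """
    return sum(c * t for c, t in samples) / period


def decayed_sum(averages, f):
    """s = a_n + f*a_(n-1) + ... + f^n*a_0, with averages = [a_0..a_n]."""
    n = len(averages) - 1
    return sum(f ** (n - x) * a for x, a in enumerate(averages))


def update(s_prev, a_new, f):
    """Incremental form of the same sum: s_n = a_n + f * s_(n-1)."""
    return a_new + f * s_prev


# Hypothetical trace: three periods of 10 time units each.
periods = [
    [(2, 4), (1, 2)],  # 2 tasks for 4 units, 1 task for 2, idle for 4
    [(0, 10)],         # fully idle period
    [(4, 5), (0, 5)],  # burst of 4 tasks for half the period
]
averages = [period_average(p, 10) for p in periods]

f = 0.5  # assumed decay factor, chosen only for this example

# Fold the period averages in one at a time, oldest first.
s = 0.0
for a in averages:
    s = update(s, a, f)
```

Note that s is a decayed sum rather than a normalized average: for a
constant period average a and f < 1 it approaches a / (1 - f).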