From mboxrd@z Thu Jan 1 00:00:00 1970
From: George Dunlap
Subject: Re: [PATCH] xen: credit1: on vCPU wakeup, kick away current only if makes sense
Date: Mon, 2 Nov 2015 12:03:42 +0000
Message-ID: <5637511E.5030107@citrix.com>
In-Reply-To: <20151029105742.22610.76705.stgit@Solace.station>
References: <20151029105742.22610.76705.stgit@Solace.station>
To: Dario Faggioli, xen-devel@lists.xenproject.org
Cc: suokun, Kun Suo
List-Id: xen-devel@lists.xenproject.org

On 29/10/15 10:57, Dario Faggioli wrote:
> In fact, when waking up a vCPU, __runq_tickle() is called
> to allow the new vCPU to run on a pCPU (which one depends
> on the relationship between the priority of the new vCPU
> and the ones of the vCPUs that are already running).
>
> If there is no idle processor on which the new vCPU can
> run (e.g., because of pinning/affinity), we try to migrate
> away the vCPU that is currently running on the new vCPU's
> processor (i.e., the processor on which the vCPU is waking
> up).
>
> Now, trying to migrate a vCPU has the effect of pushing it
> through a
>
>  running --> offline --> runnable
>
> transition, which, in turn, has the following negative
> effects:
>
>  1) Credit1 counts that as a wakeup, and it BOOSTs the
>     vCPU, even if it is a CPU-bound one, which wouldn't
>     normally have deserved boosting. This can prevent
>     legit I/O-bound vCPUs from getting hold of the processor
>     until such spurious boosting expires, hurting
>     performance!
>
>  2) since the vCPU fails the vcpu_runnable() test
>     (within the call to csched_schedule() that follows
>     the wakeup, as a consequence of tickling) the
>     scheduling rate-limiting mechanism is also fooled,
>     i.e., the context switch happens even if less than
>     the minimum execution amount of time has passed.
>
> In particular, 1) has been reported to cause the following
> issue:
>
>  * VM-I/O: 1-vCPU pinned to a pCPU, running netperf
>  * VM-CPU: 1-vCPU pinned to the same pCPU, running a busy
>            CPU loop
>  ==> Only VM-I/O: throughput is 806.64 Mbps
>  ==> VM-I/O + VM-CPU: throughput is 166.50 Mbps
>
> This patch solves (for the above scenario) the problem
> by checking whether or not it makes sense to try to
> migrate away the vCPU currently running on the processor.
> In fact, if there aren't idle processors where such a vCPU
> can execute, attempting the migration is just futile
> (harmful, actually!).
>
> With this patch, in the above configuration, results are:
>
>  ==> Only VM-I/O: throughput is 807.18 Mbps
>  ==> VM-I/O + VM-CPU: throughput is 731.66 Mbps
>
> Reported-by: Kun Suo
> Signed-off-by: Dario Faggioli
> Tested-by: Kun Suo

I'm getting a bit worried about how long the path is to actually wake up
a vcpu; if this only affected the "pin" case, then I might say it wasn't
worth it.  But it looks to me like this could be a consistent pattern on
any system where there were consistently no idlers available; so at this
point it's probably better to have than not:

Acked-by: George Dunlap
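
For anyone following the thread without the diff at hand, the check the
changelog describes can be sketched roughly as below. This is a simplified,
standalone model and not the Credit1 code: the names cpu_mask, vcpu_model,
wake_action, can_run_on_idler() and on_wakeup() are all made up for
illustration, priorities are plain integers, and pCPU sets are 32-bit
bitmasks. It only captures the decision "preempt current, but ask it to
migrate only if some idle pCPU is in its affinity".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one bit per pCPU, at most 32 pCPUs. */
typedef uint32_t cpu_mask;

struct vcpu_model {
    int prio;          /* larger value = higher priority (assumption) */
    cpu_mask affinity; /* pCPUs this vCPU is allowed to run on */
};

enum wake_action {
    WAKE_NO_ACTION,          /* waking vCPU just waits in the runqueue */
    WAKE_TICKLE_IDLER,       /* an idle pCPU can pick up the waking vCPU */
    WAKE_PREEMPT_ONLY,       /* preempt current, leave it on this pCPU */
    WAKE_PREEMPT_AND_MIGRATE /* preempt current and try to move it away */
};

/* True if at least one idle pCPU is within the vCPU's affinity. */
static bool can_run_on_idler(const struct vcpu_model *v, cpu_mask idlers)
{
    return (v->affinity & idlers) != 0;
}

/*
 * Decide what to do when 'waking' wakes up on the pCPU where 'cur' runs.
 * 'idlers' is the set of currently idle pCPUs.
 */
static enum wake_action on_wakeup(const struct vcpu_model *waking,
                                  const struct vcpu_model *cur,
                                  cpu_mask idlers)
{
    assert(waking && cur);

    /* An idle pCPU suitable for the waking vCPU exists: use it. */
    if (can_run_on_idler(waking, idlers))
        return WAKE_TICKLE_IDLER;

    /* No suitable idler and no priority advantage: nothing to do. */
    if (waking->prio <= cur->prio)
        return WAKE_NO_ACTION;

    /*
     * The waking vCPU preempts current. Migrating current away is only
     * worth attempting if it has somewhere idle to go; otherwise the
     * attempt just pushes it through a spurious wakeup.
     */
    if (can_run_on_idler(cur, idlers))
        return WAKE_PREEMPT_AND_MIGRATE;

    return WAKE_PREEMPT_ONLY;
}
```

In the reported setup (both vCPUs pinned to the same pCPU, other pCPUs
idle), this model returns WAKE_PREEMPT_ONLY, so the CPU-bound vCPU is never
sent through the running --> offline --> runnable transition.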