From mboxrd@z Thu Jan  1 00:00:00 1970
From: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Subject: Re: Paravirtualized pause loop handling
Date: Thu, 20 Sep 2012 13:12:29 +0530
Message-ID: <505AC8E5.8060206@linux.vnet.ibm.com>
References: <CAJocwcf9MiXD3J5jRvbsN76mpCoFJPGQBiVGittb-1Jch8WwOQ@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: kvm@vger.kernel.org
To: Jiannan Ouyang <ouyang@cs.pitt.edu>
Return-path: <kvm-owner@vger.kernel.org>
Received: from e23smtp07.au.ibm.com ([202.81.31.140]:39164 "EHLO
	e23smtp07.au.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753531Ab2ITHqJ (ORCPT <rfc822;kvm@vger.kernel.org>);
	Thu, 20 Sep 2012 03:46:09 -0400
Received: from /spool/local
	by e23smtp07.au.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted
	for <kvm@vger.kernel.org> from <raghavendra.kt@linux.vnet.ibm.com>;
	Thu, 20 Sep 2012 17:43:55 +1000
Received: from d23av04.au.ibm.com (d23av04.au.ibm.com [9.190.235.139])
	by d23relay04.au.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id q8K7agFH16646336
	for <kvm@vger.kernel.org>; Thu, 20 Sep 2012 17:36:43 +1000
Received: from d23av04.au.ibm.com (loopback [127.0.0.1])
	by d23av04.au.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id q8K7jxNl004519
	for <kvm@vger.kernel.org>; Thu, 20 Sep 2012 17:45:59 +1000
In-Reply-To: <CAJocwcf9MiXD3J5jRvbsN76mpCoFJPGQBiVGittb-1Jch8WwOQ@mail.gmail.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On 09/13/2012 02:48 AM, Jiannan Ouyang wrote:
> Hi Raghu,
>
> I have been working on improving paravirtualized spinlock performance
> for a while. Based on my past findings, I have come up with a new idea
> to make the pause-loop handler more efficient.
>
> Our original idea was to expose VMM scheduling information to the
> guest, so that a lock requester can sleep/yield when the lock holder
> has been scheduled out, instead of spinning for SPIN_THRESHOLD loops.
> However, as I moved forward, I found that the problems with this
> approach are:
> - the saving from SPIN_THRESHOLD is only a few microseconds

We try to set SPIN_THRESHOLD to an optimal value (the typical
lock-holding time). If we spin for longer than that, it would ideally
mean a lock-holder preemption (LHP) case.
But I agree that choosing a good SPIN_THRESHOLD is a little tricky.
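
To make the spin-then-block behaviour concrete, here is a minimal
user-space sketch in Python. It is only a model, not kernel code: the
threshold value and the two callbacks (lock_is_free, block_until_kicked)
are hypothetical names chosen for illustration.

```python
# Illustrative model of "spin up to SPIN_THRESHOLD, then block".
# The threshold value and callback names are assumptions, not kernel code.
SPIN_THRESHOLD = 1 << 11  # hypothetical budget of spin iterations


def acquire(lock_is_free, block_until_kicked, threshold=SPIN_THRESHOLD):
    """Spin for at most `threshold` checks, then fall back to blocking.

    Returns ("spin", n) if the lock was seen free after n failed checks,
    or ("blocked", threshold) if the spin budget ran out.
    """
    for spins in range(threshold):
        if lock_is_free():
            return ("spin", spins)
    # Spun longer than a typical hold time: treat it as a likely LHP case.
    block_until_kicked()
    return ("blocked", threshold)
```

If the threshold is too small, short critical sections end up on the
blocking path. If it is too large, the waiter burns cycles while the
holder is preempted. That is the tuning problem mentioned above.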

> - yielding to another CPU is not efficient because it will only come
> back after a few ms, 1000x more than the normal lock waiting time

No. It is efficient if we are able to refine the set of candidate vcpus
to yield_to. But finding a good candidate is tricky too.
Here is one successful attempt:
https://lkml.org/lkml/2012/7/18/247
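
As a rough illustration of what refining the candidates can mean, the
sketch below skips vcpus that are themselves spinning, on the assumption
that they are waiters rather than the lock holder. The Vcpu fields and
the function name are hypothetical, and this is not the code from the
patch linked above.

```python
from dataclasses import dataclass


@dataclass
class Vcpu:
    vid: int
    runnable: bool       # preempted but ready to run
    in_spin_loop: bool   # assumed flag: this vcpu recently did a pause-loop exit


def pick_yield_target(vcpus, me):
    """Return the first vcpu worth a directed yield, or None if there is none."""
    for v in vcpus:
        if v.vid == me:
            continue  # never yield to ourselves
        if not v.runnable:
            continue  # already running or blocked, so a boost does nothing
        if v.in_spin_loop:
            continue  # likely a waiter too, so boosting it would not help
        return v
    return None
```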

> - sleeping upon lock holder preemption makes sense, but that has been
> done very well by your pv_lock patch
>
> Below is some data I got
> - two 4-core guests on a 4-core host
> - guest1: hackbench, average completion time over 10 runs, lower is better
> - guest2: 4 processes each running a while-true loop
>
>                            Average(s)   Stdev
> Native                       8.6739     0.51965
> Stock kernel -ple           84.1841    17.37156
> + ple                       80.6322    27.6574
> + cpu binding               25.6569     1.93028
> + pv_lock                   17.8462     0.74884
> + cpu binding & pv_lock     16.9935     0.772416
>
> Observations are:
> - the improvement from ple (~4s) is much smaller than that from
> pv_lock and cpu binding (~60s)
> - the best performance comes from pv_lock with cpu binding, which binds
> the 4 vcpus to four physical cores. Idea from (1)
>

The results are interesting. I am trying out V9 with all the
improvements that took place after V8.
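
For reference, the arithmetic behind the observations can be reproduced
from the averages in the quoted table. The short Python snippet below
only restates those numbers; the helper names are mine.

```python
# Average hackbench completion times (seconds), copied from the quoted table.
results = {
    "Native": 8.6739,
    "Stock kernel -ple": 84.1841,
    "+ ple": 80.6322,
    "+ cpu binding": 25.6569,
    "+ pv_lock": 17.8462,
    "+ cpu binding & pv_lock": 16.9935,
}

baseline = results["Stock kernel -ple"]
native = results["Native"]


def saved_seconds(name):
    """Seconds saved relative to the stock kernel without ple."""
    return baseline - results[name]


def slowdown_vs_native(name):
    """How many times slower than native this configuration is."""
    return results[name] / native


for name in results:
    print(f"{name:26s} saved={saved_seconds(name):8.4f}s "
          f"slowdown={slowdown_vs_native(name):6.2f}x")
```

Even the best configuration is still roughly twice as slow as native, so
there is room left to improve.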

> Then I came up with the "paravirtualized pause-loop exit" idea.
> The current vcpu boosting strategy upon ple is not very efficient,
> because 1) it may boost the wrong vcpu, and 2) the time for the lock
> holder to come back is very likely to be a few ms, much longer than the
> normal lock waiting time of a few us.
>
> What we can do is expose guest lock waiting information to the VMM, and
> upon ple, the VMM can make the vcpu sleep on the lock holder's wait
> queue. Later we can wake them up when the lock holder is scheduled
> in. Or, taking it one step further, make a vcpu sleep on the previous
> ticket holder's wait queue, so that we ensure the order of the wake-ups.
>

This is very interesting. Can you share the patches?
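
Just to check my understanding of the second variant, here is a toy
Python model in which the waiter holding ticket t sleeps on the wait
queue of ticket t - 1 and is woken when that ticket's holder releases
the lock. The class and method names are hypothetical, and the model
ignores scheduling entirely. It only shows the wake-up ordering.

```python
from collections import defaultdict


class TicketWaitModel:
    """Toy model: the waiter with ticket t sleeps on the queue of ticket t - 1."""

    def __init__(self):
        self.sleeping_on = defaultdict(list)  # ticket -> vcpus asleep on its queue
        self.wake_order = []                  # vcpus in the order they were woken

    def sleep_on_previous(self, vcpu, ticket):
        """Put `vcpu` to sleep on the wait queue of the previous ticket holder."""
        self.sleeping_on[ticket - 1].append(vcpu)

    def release(self, ticket):
        """Holder of `ticket` releases the lock and wakes whoever slept on its queue."""
        for vcpu in self.sleeping_on.pop(ticket, []):
            self.wake_order.append(vcpu)


model = TicketWaitModel()
# Waiters go to sleep in an arbitrary order...
for vcpu, ticket in [("v3", 3), ("v1", 1), ("v2", 2)]:
    model.sleep_on_previous(vcpu, ticket)
# ...but releases in ticket order wake them in ticket order.
for ticket in range(3):
    model.release(ticket)
print(model.wake_order)
```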

> I'm almost done with the implementation, except for some testing work.
> Any comments or suggestions?
>
> Thanks
> --Jiannan
>
> Reference
> (1) Is co-scheduling too expensive for SMP VMs? O. Sukwong, H. S. Kim, EuroSys '11