From: Srivatsa Vaddagiri
Subject: Re: [PATCH RFC V6 0/11] Paravirtualized ticketlocks
Date: Sat, 31 Mar 2012 09:37:45 +0530
Message-ID: <20120331040745.GC14030@linux.vnet.ibm.com>
References: <20120321102041.473.61069.sendpatchset@codeblue.in.ibm.com> <4F7616F5.4070000@zytor.com>
To: Thomas Gleixner
Cc: the arch/x86 maintainers, KVM, Konrad Rzeszutek Wilk, Peter Zijlstra, Stefano Stabellini, Raghavendra K T, LKML, Andi Kleen, Avi Kivity, Jeremy Fitzhardinge, "H. Peter Anvin", Attilio Rao, Ingo Molnar, Virtualization, Linus Torvalds, Xen Devel, Stephan Diestelhorst
List-Id: virtualization@lists.linuxfoundation.org

* Thomas Gleixner [2012-03-31 00:07:58]:

> I know that Peter is going to go berserk on me, but if we are running
> a paravirt guest then it's simple to provide a mechanism which allows
> the host (aka hypervisor) to check that in the guest just by looking
> at some global state.
>
> So if a guest exits due to an external event it's easy to inspect the
> state of that guest and avoid scheduling it away when it was
> interrupted in a spinlock-held section. That guest/host shared state
> needs to be modified to signal the guest to invoke an exit when the
> last nested lock has been released.

I had attempted something like that long back:

http://lkml.org/lkml/2010/6/3/4

The issue with ticketlocks, though, is that vcpus can go into a spin
without the lock being held by anybody. Say VCPUs 1-99 try to grab a
lock in that order (on a host with one cpu). VCPU1 wins (after VCPU0
releases it) and releases the lock. VCPU2 is now next in line to take
the lock. If the host does not schedule VCPU2 early enough, the
remaining vcpus keep spinning (even though the lock is technically not
held by anybody) without making forward progress.

In that situation, what we really need is for the guest to hint to the
host scheduler that it should schedule VCPU2 early (via yield_to or
something similar). The current pv-spinlock patches, however, do not
track which vcpu is spinning on which ticket of a lock. I suppose we
can consider that optimization in the future and see how much benefit
it provides over the plain yield/sleep done now.

Do you see any issues if we take in what we have today and address the
finer-grained optimization as the next step?

- vatsa
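
P.S. To make the discussion concrete, here is roughly what the sort of
guest/host shared state you describe could look like. This is only an
illustrative sketch with made-up names (vcpu_run_state,
pv_lock_acquired(), hypercall_yield(), etc.) -- not my 2010 patch and
not any existing KVM/Xen interface:

/* Per-vcpu state shared between guest and host. */
struct vcpu_run_state {
	unsigned int lock_depth;	/* nesting count of held spinlocks */
	unsigned int exit_requested;	/* host: "exit once lock_depth == 0" */
};

/* Guest side, hooked into spinlock acquire/release: */
static inline void pv_lock_acquired(struct vcpu_run_state *rs)
{
	rs->lock_depth++;
}

static inline void pv_lock_released(struct vcpu_run_state *rs)
{
	if (--rs->lock_depth == 0 && rs->exit_requested) {
		rs->exit_requested = 0;
		hypercall_yield();	/* voluntarily trap back to the host */
	}
}

On the host side, an exit caused by an external event would check
lock_depth: if it is non-zero, the host keeps the vcpu running and sets
exit_requested instead of scheduling it away.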
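
The queueing problem above falls straight out of the structure of a
ticketlock. A minimal standalone sketch (C11 atomics; the real x86
implementation differs in detail):

#include <stdatomic.h>

struct ticketlock {
	atomic_ushort head;	/* ticket currently allowed to hold the lock */
	atomic_ushort tail;	/* next ticket to hand out */
};

static void ticket_lock(struct ticketlock *lk)
{
	unsigned short me = atomic_fetch_add(&lk->tail, 1);	/* take a ticket */

	/* Strict FIFO: even when the lock is free (head was advanced),
	 * only the vcpu whose ticket matches head may enter; everybody
	 * queued behind it spins regardless. */
	while (atomic_load(&lk->head) != me)
		;	/* cpu_relax() */
}

static void ticket_unlock(struct ticketlock *lk)
{
	atomic_fetch_add(&lk->head, 1);	/* hand the lock to the next ticket */
}

With one host cpu, once VCPU1 runs ticket_unlock(), head names VCPU2's
ticket; until the host happens to run VCPU2, VCPUs 3-99 burn their
timeslices in that while loop even though nobody holds the lock.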
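
The finer-grained optimization then amounts to remembering, at lock
time, which vcpu took which ticket, so a spinner can point the host at
the one vcpu that can actually make progress. Again a hypothetical
sketch building on the ticketlock above (kvm_hypercall_yield_to() and
SPIN_THRESHOLD are made up for illustration; today's patches just
yield/sleep blindly):

#define SPIN_THRESHOLD	1024	/* arbitrary spin budget */
#define TICKET_SLOTS	128	/* >= maximum number of waiters */

struct ticketlock_pv {
	struct ticketlock lk;
	int owner[TICKET_SLOTS];	/* ticket number -> spinning vcpu id */
};

static void ticket_lock_pv(struct ticketlock_pv *pl, int my_vcpu)
{
	unsigned short me = atomic_fetch_add(&pl->lk.tail, 1);
	unsigned int spins = 0;

	pl->owner[me % TICKET_SLOTS] = my_vcpu;

	while (atomic_load(&pl->lk.head) != me) {
		if (++spins > SPIN_THRESHOLD) {
			/* Directed yield: boost the vcpu whose ticket is
			 * at the head instead of yielding at random. */
			unsigned short head = atomic_load(&pl->lk.head);
			kvm_hypercall_yield_to(pl->owner[head % TICKET_SLOTS]);
			spins = 0;
		}
	}
}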