From: Joerg Roedel
Subject: Re: [PATCH] [RESEND] KVM:VMX: Add support for Pause-Loop Exiting
Date: Sun, 27 Sep 2009 16:53:38 +0200
Message-ID: <20090927145338.GF29634@8bytes.org>
References: <4ABA2AD7.6080008@intel.com> <4ABA2C22.7020000@redhat.com> <20090925204339.GA29634@8bytes.org> <4ABF22D9.3040308@redhat.com> <20090927134650.GC29634@8bytes.org> <4ABF6D0B.8080603@redhat.com> <20090927140752.GD29634@8bytes.org> <4ABF7418.2000404@redhat.com>
In-Reply-To: <4ABF7418.2000404@redhat.com>
To: Avi Kivity
Cc: "Zhai, Edwin", Ingo Molnar, "kvm@vger.kernel.org"

On Sun, Sep 27, 2009 at 04:18:00PM +0200, Avi Kivity wrote:
> On 09/27/2009 04:07 PM, Joerg Roedel wrote:
>> On Sun, Sep 27, 2009 at 03:47:55PM +0200, Avi Kivity wrote:
>>> On 09/27/2009 03:46 PM, Joerg Roedel wrote:
>>>>> We can't find exactly which vcpu, but we can:
>>>>>
>>>>> - rule out threads that are not vcpus for this guest
>>>>> - rule out threads that are already running
>>>>>
>>>>> A major problem with sleep() is that it effectively reduces the vm
>>>>> priority relative to guests that don't have spinlock contention. By
>>>>> selecting a random nonrunnable vcpu belonging to this guest, we at least
>>>>> preserve the guest's timeslice.
>>>>>
>>>> Ok, that makes sense. But before trying that we should probably try to
>>>> call just yield() instead of schedule()? I remember someone from our
>>>> team here at AMD did this for Xen a while ago and already had pretty
>>>> good results with that. Xen has a completely different scheduler but
>>>> maybe it's worth trying?
>>>>
>>> yield() is a no-op in CFS.
>>>
>> Hmm, true. At least when kernel.sched_compat_yield == 0, which it is on my
>> distro.
>> If the scheduler gave us something like a real_yield() function that
>> assumes kernel.sched_compat_yield = 1, that might help. At least it's
>> better than sleeping for some random amount of time.
>>
>
> Depends. If it's a global yield(), yes. If it's a local yield() that
> doesn't rebalance the runqueues we might be left with the spinning task
> re-running.

Having only one runnable task on each cpu is unlikely in a situation of
high vcpu overcommit (where pause filtering matters).

> Also, if yield means "give up the remainder of our timeslice", then we
> potentially end up sleeping a much longer random amount of time. If we
> yield to another vcpu in the same guest we might not care, but if we
> yield to some other guest we're seriously penalizing ourselves.

I agree that a directed yield with possible rebalance would be good to
have, but this is very intrusive to the scheduler code and I think we
should at least check whether this simpler approach already gives us
good results.

	Joerg
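
For reference, the candidate filter quoted at the top of this thread (same guest, not already running) could be modelled roughly as below. This is only a userspace sketch to illustrate the selection rule, not scheduler or KVM code: struct vcpu_model, pick_yield_target() and the pick parameter are hypothetical names made up for the example, and a real directed yield would still need scheduler support to act on the chosen thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a vcpu thread; not a KVM structure. */
struct vcpu_model {
	int guest_id;	/* guest this vcpu belongs to */
	bool running;	/* currently executing on a physical cpu */
};

/* A candidate is another vcpu of the same guest that is not running. */
static bool is_candidate(const struct vcpu_model *v, size_t self, size_t i)
{
	return i != self && v[i].guest_id == v[self].guest_id && !v[i].running;
}

/*
 * Pick a yield target for the spinning vcpu at index 'self'.
 * 'pick' chooses among the candidates; a caller would pass a random number.
 * Returns the index of the chosen vcpu, or -1 if there is no candidate.
 */
int pick_yield_target(const struct vcpu_model *v, size_t n, size_t self,
		      unsigned int pick)
{
	size_t candidates = 0;
	size_t want, i;

	assert(v != NULL && self < n);

	/* First pass: count the vcpus that survive both filters. */
	for (i = 0; i < n; i++)
		if (is_candidate(v, self, i))
			candidates++;
	if (candidates == 0)
		return -1;

	/* Second pass: walk to the (pick % candidates)-th survivor. */
	want = pick % candidates;
	for (i = 0; i < n; i++) {
		if (!is_candidate(v, self, i))
			continue;
		if (want == 0)
			return (int)i;
		want--;
	}
	return -1;
}
```

When no candidate exists (return value -1) the caller would have to fall back to something else, for example the plain yield() discussed above.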