From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jeremy Fitzhardinge
Subject: Re: [Patch 2 of 2]: PV-domain SMP performance Linux-part
Date: Tue, 20 Jan 2009 12:12:04 -0800
Message-ID: <49763014.6050705@goop.org>
References: <4970C6D3.2080206@goop.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: George Dunlap
Cc: "xen-devel@lists.xensource.com", Juergen Gross, Keir Fraser
List-Id: xen-devel@lists.xenproject.org

George Dunlap wrote:
> On Fri, Jan 16, 2009 at 5:41 PM, Jeremy Fitzhardinge wrote:
>
>> Yes, that's more or less right.  Each lock has a count of how many cpus are
>> waiting for the lock; if it's non-zero on unlock, the unlocker kicks all the
>> waiting cpus via IPI.  There's a per-cpu variable of "lock I am waiting
>> for"; the kicker looks at each cpu's entry and kicks it if it's waiting for
>> the lock being unlocked.
>>
>> The locking side does the expected "spin for a while, then block on
>> timeout".  The timeout is settable if you have the appropriate debugfs
>> option enabled (which also produces quite a lot of detailed stats about
>> locking behaviour).  The IPI is never delivered as an event BTW; the locker
>> uses the event poll hypercall to block until the event is pending (this
>> hypercall had some performance problems until relatively recent versions of
>> Xen; I'm not sure which release versions have the fix).
>>
>> The lock itself is a simple byte spinlock, with no fairness guarantees; I'm
>> assuming (hoping) that the pathological cases that ticket locks were
>> introduced to solve will be mitigated by the timeout/blocking path (and/or
>> less likely in a virtual environment anyway).
>>
>> I measured a small performance improvement within the domain with this patch
>> (kernbench-type workload), but an overall 10% reduction in system-wide CPU
>> use with multiple competing domains.
>>
>
> This is in the pv-ops kernel; is it in the Xen 2.6.18 kernel yet?
>

Yes.  No plans to backport.

> Another thing to consider is how the approach applies to a related
> problem, that of "synchronous" IPI function calls: i.e., when v0 sends
> an IPI to v1 to do something, and spins waiting for it to be done,
> expecting it to be finished pretty quickly.  But v1 is over credits,
> so it doesn't get to run, and v0 burns its credits waiting.
>

Yes.  Some kind of direct yield might work in that case.  In practice it
hasn't been a huge problem in Linux because most synchronous IPIs are
for cross-cpu TLB flushes, which we use a hypercall for anyway.

    J
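
For readers following along, the locking scheme described in the quoted
text above (byte lock, per-lock waiter count, per-cpu "lock I am waiting
for" entry, spin then block, kick on unlock) can be sketched roughly as
below.  This is a user-space illustration using C11 atomics, not the
pv-ops code: all names (byte_lock, waiting_on, kicked, SPIN_TIMEOUT,
NR_CPUS) are hypothetical, and the "kicked" flag is only a stand-in for
the event channel that the real locker waits on via the event poll
hypercall.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS      4     /* hypothetical number of (virtual) cpus */
#define SPIN_TIMEOUT 1024  /* hypothetical spin budget before blocking */

struct byte_lock {
    atomic_uchar locked;   /* 0 = free, 1 = held; no fairness guarantee */
    atomic_uint  waiters;  /* cpus that are blocked or about to block */
};

/* Per-cpu "lock I am waiting for" entry (NULL when not waiting). */
static _Atomic(struct byte_lock *) waiting_on[NR_CPUS];

/* Stand-in for a pending kick; the real code polls an event channel. */
static atomic_bool kicked[NR_CPUS];

static bool byte_lock_try(struct byte_lock *l)
{
    return atomic_exchange(&l->locked, 1) == 0;
}

/* Stand-in for blocking in the hypervisor until the kick is pending. */
static void block_until_kicked(int cpu)
{
    while (!atomic_exchange(&kicked[cpu], false))
        ;  /* a real guest would sleep here instead of spinning */
}

static void byte_lock_acquire(struct byte_lock *l, int cpu)
{
    for (;;) {
        /* Fast path: spin for a while. */
        for (int i = 0; i < SPIN_TIMEOUT; i++)
            if (byte_lock_try(l))
                return;

        /* Slow path: advertise what we wait for, then block. */
        atomic_store(&waiting_on[cpu], l);
        atomic_fetch_add(&l->waiters, 1);

        /* Re-check after registering, so an unlock that ran just
         * before registration cannot leave us blocked forever. */
        bool got = byte_lock_try(l);
        if (!got)
            block_until_kicked(cpu);

        atomic_fetch_sub(&l->waiters, 1);
        atomic_store(&waiting_on[cpu], NULL);
        if (got)
            return;
        /* Kicked: go back to spinning and compete for the lock. */
    }
}

/* Releases the lock and returns how many cpus were kicked. */
static int byte_lock_release(struct byte_lock *l)
{
    int n = 0;

    atomic_store(&l->locked, 0);
    if (atomic_load(&l->waiters) != 0) {
        for (int cpu = 0; cpu < NR_CPUS; cpu++) {
            if (atomic_load(&waiting_on[cpu]) == l) {
                atomic_store(&kicked[cpu], true);
                n++;
            }
        }
    }
    return n;
}
```

Note that every waiter on the lock is kicked and they all race for it
again, which matches the "no fairness guarantees" caveat above; a stale
kick only causes a spurious wakeup followed by another spin.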