From: Malcolm Crossley
Subject: Re: [PATCH 2/2] grant_table: convert grant table rwlock to percpu rwlock
Date: Thu, 19 Nov 2015 09:03:25 +0000
Message-ID: <564D905D.4080005@citrix.com>
References: <1446573502-8019-1-git-send-email-malcolm.crossley@citrix.com> <1446573502-8019-2-git-send-email-malcolm.crossley@citrix.com> <564B6C1A02000078000B603C@prv-mh.provo.novell.com> <564B6453.6050008@citrix.com> <20151118200211.GE1762@char.us.oracle.com>
In-Reply-To: <20151118200211.GE1762@char.us.oracle.com>
To: Konrad Rzeszutek Wilk, Andrew Cooper
Cc: xen-devel@lists.xenproject.org, stefano.stabellini@citrix.com, keir@xen.org, ian.campbell@citrix.com, Jan Beulich
List-Id: xen-devel@lists.xenproject.org

On 18/11/15 20:02, Konrad Rzeszutek Wilk wrote:
> On Tue, Nov 17, 2015 at 05:30:59PM +0000, Andrew Cooper wrote:
>> On 17/11/15 17:04, Jan Beulich wrote:
>>>>>> On 03.11.15 at 18:58, wrote:
>>>> --- a/xen/common/grant_table.c
>>>> +++ b/xen/common/grant_table.c
>>>> @@ -178,6 +178,10 @@ struct active_grant_entry {
>>>>  #define _active_entry(t, e) \
>>>>      ((t)->active[(e)/ACGNT_PER_PAGE][(e)%ACGNT_PER_PAGE])
>>>>
>>>> +bool_t grant_rwlock_barrier;
>>>> +
>>>> +DEFINE_PER_CPU(rwlock_t *, grant_rwlock);
>>> Shouldn't these be per grant table? And wouldn't doing so eliminate
>>> the main limitation of the per-CPU rwlocks?
>>
>> The grant rwlock is per grant table.
>>
>> The entire point of this series is to reduce the cmpxchg storm which
>> happens when many pcpus attempt to grab the same domain's grant read lock.
>>
>> As identified in the commit message, reducing the cmpxchg pressure on
>> the cache coherency fabric increases intra-vm network throughput from
>> 10Gbps to 50Gbps when running iperf between two 16-vcpu guests.
>>
>> Or in other words, 80% of cpu time is wasted waiting on an atomic
>> read/modify/write operation against a remote hot cache line.
>>
>
> Why not use MCS locks then (in Linux the implementation is known
> as qspinlock)? Plus they have added extra code to protect against
> recursion (via four levels). See Linux commit
> a33fda35e3a7655fb7df756ed67822afb5ed5e8d
> ("locking/qspinlock: Introduce a simple generic 4-byte queued spinlock")
>

The Linux qspinlock is MCS-based, but MCS only helps under lock
contention. It still uses a single data location for the lock, and so
suffers from cache line bouncing, plus the cmpxchg overhead for taking
an uncontended lock.

You can see the qspinlock using the cmpxchg mechanism here:

http://lxr.free-electrons.com/source/include/asm-generic/qspinlock.h#L62

I've copy-pasted the qspinlock lock implementation inline for convenience:

static __always_inline void queued_spin_lock(struct qspinlock *lock)
{
        u32 val;

        val = atomic_cmpxchg(&lock->val, 0, _Q_LOCKED_VAL);
        if (likely(val == 0))
                return;
        queued_spin_lock_slowpath(lock, val);
}

Malcolm

>> ~Andrew
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> http://lists.xen.org/xen-devel