From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Cooper
Subject: Re: [PATCH 2/2] grant_table: convert grant table rwlock to percpu rwlock
Date: Tue, 17 Nov 2015 17:53:44 +0000
Message-ID: <564B69A8.6050609@citrix.com>
References: <1446573502-8019-1-git-send-email-malcolm.crossley@citrix.com>
 <1446573502-8019-2-git-send-email-malcolm.crossley@citrix.com>
 <564B6C1A02000078000B603C@prv-mh.provo.novell.com>
 <564B6453.6050008@citrix.com>
 <564B746802000078000B60E1@prv-mh.provo.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <564B746802000078000B60E1@prv-mh.provo.novell.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich
Cc: xen-devel@lists.xenproject.org, Malcolm Crossley, keir@xen.org,
 stefano.stabellini@citrix.com, ian.campbell@citrix.com
List-Id: xen-devel@lists.xenproject.org

On 17/11/15 17:39, Jan Beulich wrote:
>>>> On 17.11.15 at 18:30, wrote:
>> On 17/11/15 17:04, Jan Beulich wrote:
>>>>>> On 03.11.15 at 18:58, wrote:
>>>> --- a/xen/common/grant_table.c
>>>> +++ b/xen/common/grant_table.c
>>>> @@ -178,6 +178,10 @@ struct active_grant_entry {
>>>>  #define _active_entry(t, e) \
>>>>      ((t)->active[(e)/ACGNT_PER_PAGE][(e)%ACGNT_PER_PAGE])
>>>>
>>>> +bool_t grant_rwlock_barrier;
>>>> +
>>>> +DEFINE_PER_CPU(rwlock_t *, grant_rwlock);
>>> Shouldn't these be per grant table? And wouldn't doing so eliminate
>>> the main limitation of the per-CPU rwlocks?
>> The grant rwlock is per grant table.
> That's understood, but I don't see why the above items aren't, too.

Ah - because there is never any circumstance where two grant tables are
locked on the same pcpu.  Nor is there any such need.

>
>> The entire point of this series is to reduce the cmpxchg storm which
>> happens when many pcpus attempt to grab the same domain's grant read
>> lock.
>>
>> As identified in the commit message, reducing the cmpxchg pressure on
>> the cache coherency fabric increases intra-VM network throughput from
>> 10Gbps to 50Gbps when running iperf between two 16-vcpu guests.
>>
>> Or in other words, 80% of CPU time is wasted waiting on an atomic
>> read/modify/write operation against a remote hot cache line.
> All of this is pretty nice, but again unrelated to the question I
> raised.
>
> The whole interface would likely become quite a bit easier to use
> if there was a percpu_rwlock_t comprising all three elements (the
> per-CPU item obviously would need to become a per-CPU pointer,
> with allocation of per-CPU data needing introduction).

Runtime per-CPU data allocation is incompatible with our current scheme
(which relies on the linker to do some of the heavy lifting).

~Andrew
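
P.S. For anyone trying to picture the mechanism under discussion - a
per-lock-class barrier plus a per-CPU pointer recording which lock of the
class a CPU currently read-holds - here is a rough, self-contained
userspace sketch of how such a fast path might look.  It is not the posted
patch: pthreads and C11 seq_cst atomics stand in for Xen's per-CPU data and
memory barriers, a writer count replaces the single barrier flag, and every
identifier below is invented for illustration.

/*
 * Illustrative sketch only.  "lock" is an ordinary, already-initialised
 * pthread_rwlock_t (one per grant table, in the analogy).  The default
 * seq_cst ordering of <stdatomic.h> supplies the store/load ordering a
 * real implementation would get from explicit barriers.
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_CPUS 64

/*
 * Per lock *class* (cf. grant_rwlock_barrier and the per-CPU grant_rwlock
 * pointer in the hunk above): a count of active writers, plus one slot per
 * CPU recording which lock of the class that CPU read-holds via the fast
 * path.  One slot per CPU is exactly why a CPU must never read-hold two
 * locks of the same class at once - the limitation discussed above.
 */
static atomic_int writers_active;
static _Atomic(pthread_rwlock_t *) reading[MAX_CPUS];

void pcpu_read_lock(pthread_rwlock_t *lock, unsigned int cpu)
{
    /* Fast path: publish the claim in our own slot (no shared cmpxchg),
     * then check whether any writer of the class is active. */
    atomic_store(&reading[cpu], lock);
    if ( atomic_load(&writers_active) )
    {
        /* Slow path: retract the claim and queue on the real rwlock. */
        atomic_store(&reading[cpu], NULL);
        pthread_rwlock_rdlock(lock);
    }
}

void pcpu_read_unlock(pthread_rwlock_t *lock, unsigned int cpu)
{
    if ( atomic_load(&reading[cpu]) == lock )
        atomic_store(&reading[cpu], NULL);   /* fast-path release */
    else
        pthread_rwlock_unlock(lock);         /* slow-path release */
}

void pcpu_write_lock(pthread_rwlock_t *lock)
{
    /* Divert new readers of the whole class onto the slow path... */
    atomic_fetch_add(&writers_active, 1);
    /* ...wait for existing fast-path readers of *this* lock to drain... */
    for ( unsigned int cpu = 0; cpu < MAX_CPUS; cpu++ )
        while ( atomic_load(&reading[cpu]) == lock )
            ;   /* spin; a real implementation would relax the CPU here */
    /* ...then take the underlying write lock as usual. */
    pthread_rwlock_wrlock(lock);
}

void pcpu_write_unlock(pthread_rwlock_t *lock)
{
    pthread_rwlock_unlock(lock);
    atomic_fetch_sub(&writers_active, 1);
}

The write side deliberately pays for visiting every CPU's slot, which is
only a good trade when the lock is taken overwhelmingly for reading - as
the grant table lock is in the iperf scenario quoted above.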