From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Vrabel
Subject: Re: [PATCHv7 3/3] gnttab: use per-VCPU maptrack free lists
Date: Tue, 12 May 2015 12:01:13 +0100
Message-ID: <5551DD79.40507@citrix.com>
References: <1430400525-31064-1-git-send-email-david.vrabel@citrix.com> <1430400525-31064-4-git-send-email-david.vrabel@citrix.com> <5548D4F60200007800076AA9@mail.emea.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta3.messagelabs.com ([195.245.230.39]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1Ys7wF-0001T0-Li for xen-devel@lists.xenproject.org; Tue, 12 May 2015 11:01:27 +0000
In-Reply-To: <5548D4F60200007800076AA9@mail.emea.novell.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich, David Vrabel
Cc: Keir Fraser, Ian Campbell, Christoph Egger, Tim Deegan, Matt Wilson, xen-devel@lists.xenproject.org, Malcolm Crossley
List-Id: xen-devel@lists.xenproject.org

On 05/05/15 13:34, Jan Beulich wrote:
>>>> On 30.04.15 at 15:28, wrote:
>> From: Malcolm Crossley
>>
>> Performance analysis of aggregate network throughput with many VMs
>> shows that performance is significantly limited by contention on the
>> maptrack lock when obtaining/releasing maptrack handles from the free
>> list.
>>
>> Instead of a single free list use a per-VCPU list. This avoids any
>> contention when obtaining a handle. Handles must be released back to
>> their original list and since this may occur on a different VCPU there
>> is some contention on the destination VCPU's free list tail pointer
>> (but this is much better than a per-domain lock).
>>
>> A domain's maptrack limit is multiplied by the number of VCPUs. This
>> ensures that a worst case domain that only performs grant table
>> operations via one VCPU will not see a lower map track limit.

[...]
>> +        cur_tail = v->maptrack_tail;
>
> read_atomic()?

It's not required since if this load gets inconsistent state, the
cmpxchg loop will just go around once more.

I've added the read_atomic() anyway, though.

>> @@ -1430,6 +1456,17 @@ gnttab_setup_table(
>>      gt = d->grant_table;
>>      write_lock(&gt->lock);
>>
>> +    /* Tracking of mapped foreign frames table */
>> +    if ( (gt->maptrack = xzalloc_array(struct grant_mapping *,
>> +                                       max_maptrack_frames * d->max_vcpus)) == NULL )
>> +        goto out3;
>
> This surely can easily become an allocation of far more than a page,
> and hence needs to be broken up (perhaps using vmap() to map
> individually allocated pages).

I think there should be a common vzalloc_array() function.  Do you
agree?  This will use xzalloc_array() if the alloc is < PAGE_SIZE to
avoid needlessly using vmap space.

David
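
For readers following the tail-pointer discussion, here is a minimal sketch of the retry argument, written with portable C11 atomics rather than Xen's read_atomic()/cmpxchg() primitives. The names struct vcpu_sketch and set_tail_sketch() are hypothetical and this is not the code from the patch. It only illustrates why a stale first read is harmless: the compare-and-exchange fails, reports the value it actually saw, and the loop goes around again.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-VCPU state; only the tail is modelled. */
struct vcpu_sketch {
    _Atomic uint32_t maptrack_tail;
};

/*
 * Install `handle` as the new tail and return the previous tail.
 * The initial relaxed load plays the role of the plain read in the
 * quoted hunk: if it returns a stale value, the compare-exchange
 * below fails, refreshes cur_tail with the observed value, and retries.
 */
static uint32_t set_tail_sketch(struct vcpu_sketch *v, uint32_t handle)
{
    assert(v != NULL);

    uint32_t cur_tail =
        atomic_load_explicit(&v->maptrack_tail, memory_order_relaxed);

    while (!atomic_compare_exchange_weak(&v->maptrack_tail,
                                         &cur_tail, handle)) {
        /* cur_tail was updated by the failed exchange; go around again. */
    }

    /* On success cur_tail holds the value that was replaced. */
    return cur_tail;
}
```

A caller would use the returned previous tail to link the old tail entry to the newly released handle, which is the step that follows the loop in a list of this shape.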
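
The size-based dispatch proposed for vzalloc_array() could look roughly like the user-space sketch below. Everything in it is an assumption made for illustration: SKETCH_PAGE_SIZE, small_zalloc_sketch() and large_zalloc_sketch() are hypothetical stand-ins (both backed by calloc() here), not Xen's xzalloc_array() or any vmap-based allocator. The sketch shows the threshold check plus a multiplication overflow guard, which a common helper would plausibly want.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

enum alloc_path { PATH_NONE, PATH_SMALL, PATH_LARGE };

struct sketch_alloc {
    void *ptr;
    enum alloc_path path; /* which allocator served the request */
};

/* Stand-in for the sub-page heap allocator. */
static void *small_zalloc_sketch(size_t bytes)
{
    return calloc(1, bytes);
}

/* Stand-in for a page-granular allocator: rounds up to whole pages. */
static void *large_zalloc_sketch(size_t bytes)
{
    size_t pages = (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
    return calloc(pages, SKETCH_PAGE_SIZE);
}

/* Zeroed array of `num` elements of `size` bytes, dispatched by total size. */
static struct sketch_alloc zalloc_array_sketch(size_t num, size_t size)
{
    struct sketch_alloc a = { NULL, PATH_NONE };

    assert(size != 0);
    if (num == 0 || num > SIZE_MAX / size)
        return a; /* empty request or num * size would overflow */

    size_t bytes = num * size;
    if (bytes < SKETCH_PAGE_SIZE) {
        a.ptr = small_zalloc_sketch(bytes);
        a.path = a.ptr ? PATH_SMALL : PATH_NONE;
    } else {
        a.ptr = large_zalloc_sketch(bytes);
        a.path = a.ptr ? PATH_LARGE : PATH_NONE;
    }
    return a;
}
```

With this shape, a small table of pointers stays on the ordinary heap path, while a table sized by max_maptrack_frames * d->max_vcpus would take the page-granular path once it reaches a page.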