From mboxrd@z Thu Jan  1 00:00:00 1970
From: David Vrabel <david.vrabel@citrix.com>
Subject: Re: [PATCHv7 3/3] gnttab: use per-VCPU maptrack free
 lists
Date: Tue, 12 May 2015 12:01:13 +0100
Message-ID: <5551DD79.40507@citrix.com>
References: <1430400525-31064-1-git-send-email-david.vrabel@citrix.com>
	<1430400525-31064-4-git-send-email-david.vrabel@citrix.com>
	<5548D4F60200007800076AA9@mail.emea.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
Received: from mail6.bemta3.messagelabs.com ([195.245.230.39])
	by lists.xen.org with esmtp (Exim 4.72)
	(envelope-from <david.vrabel@citrix.com>) id 1Ys7wF-0001T0-Li
	for xen-devel@lists.xenproject.org; Tue, 12 May 2015 11:01:27 +0000
In-Reply-To: <5548D4F60200007800076AA9@mail.emea.novell.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich <JBeulich@suse.com>, David Vrabel <david.vrabel@citrix.com>
Cc: Keir Fraser <keir@xen.org>, Ian Campbell <ian.campbell@citrix.com>, Christoph Egger <chegger@amazon.de>, Tim Deegan <tim@xen.org>, Matt Wilson <msw@amazon.com>, xen-devel@lists.xenproject.org, Malcolm Crossley <malcolm.crossley@citrix.com>
List-Id: xen-devel@lists.xenproject.org

On 05/05/15 13:34, Jan Beulich wrote:
>>>> On 30.04.15 at 15:28, <david.vrabel@citrix.com> wrote:
>> From: Malcolm Crossley <malcolm.crossley@citrix.com>
>>
>> Performance analysis of aggregate network throughput with many VMs
>> shows that performance is significantly limited by contention on the
>> maptrack lock when obtaining/releasing maptrack handles from the free
>> list.
>>
>> Instead of a single free list use a per-VCPU list. This avoids any
>> contention when obtaining a handle.  Handles must be released back to
>> their original list and since this may occur on a different VCPU there
>> is some contention on the destination VCPU's free list tail pointer
>> (but this is much better than a per-domain lock).
>>
>> A domain's maptrack limit is multiplied by the number of VCPUs.  This
>> ensures that a worst case domain that only performs grant table
>> operations via one VCPU will not see a lower map track limit.
[...]
>> +    cur_tail = v->maptrack_tail;
> 
> read_atomic()?

It's not strictly required: if this load observes an inconsistent
value, the cmpxchg will fail and the loop will just go around once
more.  I've added the read_atomic() anyway, though.
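
To illustrate why a stale initial load only costs one extra iteration,
here is a minimal userspace sketch using C11 atomics.  It is not the
code from the patch: struct free_list, swing_tail() and the 'guess'
parameter are hypothetical names, and atomic_compare_exchange_weak()
stands in for Xen's cmpxchg().

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the per-VCPU tail; not Xen's structure. */
struct free_list {
    _Atomic unsigned int tail;
};

/*
 * Swing the tail to 'handle' and return the previous tail.
 * 'guess' models the initial load of the tail, which may be stale.
 */
static unsigned int swing_tail(struct free_list *fl, unsigned int handle,
                               unsigned int guess)
{
    unsigned int expected = guess;

    /*
     * On failure, 'expected' is overwritten with the value currently
     * in fl->tail, so a bad guess is corrected and we simply retry.
     */
    while ( !atomic_compare_exchange_weak(&fl->tail, &expected, handle) )
        ;

    /* On success, 'expected' still holds the tail we replaced. */
    return expected;
}
```

Calling swing_tail() with a wrong guess still returns the real
previous tail; the only cost is the extra trip round the loop.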

>> @@ -1430,6 +1456,17 @@ gnttab_setup_table(
>>      gt = d->grant_table;
>>      write_lock(&gt->lock);
>>  
>> +    /* Tracking of mapped foreign frames table */
>> +    if ( (gt->maptrack = xzalloc_array(struct grant_mapping *,
>> +                                       max_maptrack_frames * d->max_vcpus)) == NULL )
>> +        goto out3;
> 
> This surely can easily become an allocation of far more than a page,
> and hence needs to be broken up (perhaps using vmap() to map
> individually allocated pages).

I think there should be a common vzalloc_array() function.  Do you
agree?  It would use xzalloc_array() when the allocation is smaller
than PAGE_SIZE, to avoid needlessly using vmap space.
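
Roughly, the size check I have in mind could look like the userspace
sketch below.  Everything here is an assumption for illustration:
SKETCH_PAGE_SIZE, choose_path() and sketch_vzalloc_array() are made-up
names, and calloc() stands in for both the xzalloc_array() path and
the page-wise vmap() path.

```c
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for this sketch */

enum alloc_path { PATH_NONE, PATH_HEAP, PATH_PAGEWISE };

/* Pick the allocator for a zeroed array of 'num' elements of 'size' bytes. */
static enum alloc_path choose_path(size_t num, size_t size)
{
    if ( size != 0 && num > SIZE_MAX / size )
        return PATH_NONE;               /* num * size would overflow */

    /* Small requests stay on the heap and use no vmap space. */
    return (num * size < SKETCH_PAGE_SIZE) ? PATH_HEAP : PATH_PAGEWISE;
}

/* Hypothetical wrapper; reports which path was chosen via 'path'. */
static void *sketch_vzalloc_array(size_t num, size_t size,
                                  enum alloc_path *path)
{
    *path = choose_path(num, size);
    if ( *path == PATH_NONE )
        return NULL;

    /*
     * Userspace stand-in: calloc() serves both paths.  In the
     * hypervisor the page-wise path would allocate pages individually
     * and map them contiguously.
     */
    return calloc(num, size);
}
```

With these assumptions an array of 511 eight-byte pointers takes the
heap path, while 512 or more take the page-wise path.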

David