Re: Possible Xen grant table locking improvements

All of lore.kernel.org
 help / color / mirror / Atom feed

From: David Vrabel <david.vrabel@citrix.com>
To: Tim Deegan <tim@xen.org>, David Vrabel <david.vrabel@citrix.com>
Cc: Keir Fraser <keir@xen.org>, Matt Wilson <msw@amazon.com>,
	Jan Beulich <jbeulich@suse.com>,
	"Xen-devel@lists.xen.org" <Xen-devel@lists.xen.org>
Subject: Re: Possible Xen grant table locking improvements
Date: Fri, 7 Nov 2014 15:41:37 +0000	[thread overview]
Message-ID: <545CE831.7070409@citrix.com> (raw)
In-Reply-To: <20141023094619.GA89807@deinos.phlegethon.org>

On 23/10/14 10:46, Tim Deegan wrote:
> Hi,
> 
> At 18:35 +0100 on 20 Oct (1413826547), David Vrabel wrote:
>> Most guests do not map a grant reference more than twice (Linux, for
>> example, will typically map a gref once in the kernel address space, or
>> twice if a userspace mapping is required).  The maptrack entries for
>> these two mappings can be stored in the active entry (the "fast"
>> entries).  If more than two mappings are required, the existing maptrack
>> table can be used (the "slow" entries).
> 
> Sounds good, as long as the hit rate is indeed high.  Do you know if
> the BSD/windows client code behaves this way too?

I don't know about BSD but XenServer's Windows PV drivers don't support
mapping grants (only granting access).

>> A maptrack handle for a "fast" entry is encoded as:
>>
>>     31 30          16  15            0
>>   +---+---------------+---------------+
>>   | F | domid         | gref          |
>>   +---+---------------+---------------+
>>
>> F is set for a "fast" entry, and clear for a "slow" one. Grant
>> references above 2^16 will have to be tracked with "slow" entries.
> 
> How restricting is that limit?  Would 2^15½ and also encoding
> which of the two entries to look at be good?

Oh,  I forgot about the bit for the entry index.

2^15 entries allows for (e.g.,) 8 multiqueue VIFs in a 8 VCPU guest.
Which is not a huge number and not a limit I would like to introduce.

One possibility would be guests wanting to use the fast path have to use
a new grant unmap hypercall that also passes the original grant ref and
domain.

>> We can omit taking the grant table lock to check the validity of a grant
>> ref or maptrack handle since these tables only grow and do not shrink.
> 
> Can you also avoid the lock for accessing the entry itself, with a bit
> of RCU magic?  Maybe that's overengineering things.

I don't think this will be necessary -- the active entry lock won't be
contended.

>> If strict IOMMU mode is used, IOMMU mappings are updated on every grant
>> map/unmap.  These are currently setup such that BFN == MFN which
>> requires reference counting the IOMMU mappings so they are only torn
>> down when all grefs for that MFN are unmapped.  This requires an
>> expensive mapcount() operation that iterates over the whole maptrack table.
>>
>> There is no requirement for BFN == MFN so each grant map can create its
>> own IOMMU mapping.  This will require a region of bus address space that
>> does not overlap with RAM.
> 
> Hrmn.  That could be tricky to arrange.  And the reference counting
> might end up being cheaper than the extra IOMMU flush operations.
> (Also, how much would you bet that clients actually use the returned
> BFN correctly?)

Hmm. Yes, if a guest has assumed that BFN == MFN, it could read the PTE
and get the BFN that way.

> Would it be enough to optimise mapcount() a bit?  We could organise the
> in-use maptrack entries as a hash table instead of (or as well as) a
> single linked list.
> 
> On similar lines, would it be worth fragmenting the maptrack itself
> (e.g. with per-page locks) to reduce locking contention instead of
> moving maptrack entries into the active entry?  If might be Good
> Enough[tm], and simpler to build/maintain than this proposal.

A per maptrack page lock you would still need a per-domain maptrack lock
protecting the maptrack free list.

A better idea may be to hash the domain and grant ref and have a
maptrack table per-hash bucket.  Each maptrack table would have its own
maptrack lock.

The maptrack handle could be:

    31               16  15            0
    +-------------------+---------------+
    | bucket            | index         |
    +-------------------+---------------+

We should probably try something simpler like this before getting
carried away...

David

     prev parent reply	other threads:[~2014-11-07 15:41 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-10-20 17:35 Possible Xen grant table locking improvements David Vrabel
2014-10-20 18:06 ` Matt Wilson
2014-10-21  9:32   ` Jan Beulich
2014-10-23  9:46 ` Tim Deegan
2014-11-07 15:41   ` David Vrabel [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=545CE831.7070409@citrix.com \
    --to=david.vrabel@citrix.com \
    --cc=Xen-devel@lists.xen.org \
    --cc=jbeulich@suse.com \
    --cc=keir@xen.org \
    --cc=msw@amazon.com \
    --cc=tim@xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.