xen-devel.lists.xenproject.org archive mirror
From: Daniel De Graaf <dgdegra@tycho.nsa.gov>
To: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: xen-devel@lists.xensource.com, Ian.Campbell@citrix.com
Subject: Re: [PATCH 5/6] xen-gntalloc: Userspace grant allocation driver
Date: Wed, 15 Dec 2010 09:18:54 -0500
Message-ID: <4D08CE4E.2050505@tycho.nsa.gov>
In-Reply-To: <4D07F266.9000008@goop.org>

On 12/14/2010 05:40 PM, Jeremy Fitzhardinge wrote:
> On 12/14/2010 02:06 PM, Daniel De Graaf wrote:
>>>> +static int gntalloc_mmap(struct file *filp, struct vm_area_struct *vma)
>>>> +{
>>>> +	struct gntalloc_file_private_data *priv = filp->private_data;
>>>> +	struct gntalloc_gref *gref;
>>>> +
>>>> +	if (debug)
>>>> +		printk("%s: priv %p, page %lu\n", __func__,
>>>> +		       priv, vma->vm_pgoff);
>>>> +
>>>> +	/*
>>>> +	 * There is a 1-to-1 correspondence of grant references to shared
>>>> +	 * pages, so it only makes sense to map exactly one page per
>>>> +	 * call to mmap().
>>>> +	 */
>>> Single-page mmap makes sense if the only possible use-cases are for
>>> single-page mappings, but if you're talking about framebuffers and the
>>> like it seems like a very awkward way to use mmap.  It would be cleaner
>>> from an API perspective to have a user-mode defined flat address space
>>> indexed by pgoff which maps to an array of grefs, so you can sensibly do
>>> a multi-page mapping.
>>>
>>> It would also allow you to hide the grefs from usermode entirely.  Then
>>> it's just up to usermode to choose suitable file offsets for itself.
>> I considered this, but wanted to keep userspace compatibility with the
>> previously created interface.
> 
> Is that private to you, or something in broader use?

This module was used as part of Qubes (http://www.qubes-os.org). The device
path has changed (/dev/gntalloc to /dev/xen/gntalloc), and the API change
adds useful functionality, so I don't think we must keep compatibility. This
will also allow cleaning up the interface to remove parameters that make no
sense (owner_domid, for example).

>>  If there's no reason to avoid doing so, I'll
>> change the ioctl interface to allocate an array of grants and calculate the
>> offset similar to how gntdev does currently (picks a suitable open slot).
> 
> I guess there are three options: you could get the kernel to allocate
> extents, make usermode do it, or have one fd per extent and always start
> from offset 0.  I guess the last could get very messy if you want to
> have lots of mappings...  Making usermode define the offsets seems
> simplest and most flexible, because then they can stitch together the
> file-offset space in any way that's convenient to them (you just need to
> deal with overlaps in that space).

Would it be useful to also give userspace control over the offsets in gntdev?

One argument for doing it in the kernel is to avoid needing to track what
offsets are already being used (and then having the kernel re-check that).
While this isn't hard, IOCTL_GNTDEV_GET_OFFSET_FOR_VADDR only exists in
order to relieve userspace of the need to track its mappings, so this
seems to have been a concern before.
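
To make the two options concrete, here is a rough userspace sketch of the
bookkeeping involved. It is not taken from gntalloc or gntdev; the names
(extent_table, extent_insert, extent_alloc) and the fixed-size table are
hypothetical, and overflow of pgoff + count is ignored. extent_insert() is
the check the kernel would need if userspace picks the offsets (reject
overlaps); extent_alloc() is the kernel-chosen variant, a first-fit search
for an open slot.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_EXTENTS 16

/* One mapping in the file-offset space, measured in pages. */
struct extent {
	uint64_t pgoff;   /* first page offset */
	uint64_t count;   /* number of pages */
};

/* Hypothetical per-file table; entries are kept sorted by pgoff. */
struct extent_table {
	struct extent ext[MAX_EXTENTS];
	size_t n;
};

/* Return 1 if [pgoff, pgoff + count) overlaps an existing extent. */
static int extent_overlaps(const struct extent_table *t,
			   uint64_t pgoff, uint64_t count)
{
	for (size_t i = 0; i < t->n; i++) {
		const struct extent *e = &t->ext[i];

		if (pgoff < e->pgoff + e->count && e->pgoff < pgoff + count)
			return 1;
	}
	return 0;
}

/*
 * Userspace-chosen offset: accept it only if it does not overlap.
 * Returns 0 on success, -1 if the table is full, count is zero,
 * or the range overlaps an existing extent.
 */
static int extent_insert(struct extent_table *t,
			 uint64_t pgoff, uint64_t count)
{
	size_t i;

	if (count == 0 || t->n == MAX_EXTENTS ||
	    extent_overlaps(t, pgoff, count))
		return -1;

	/* Shift larger entries up so the table stays sorted. */
	i = t->n;
	while (i > 0 && t->ext[i - 1].pgoff > pgoff) {
		t->ext[i] = t->ext[i - 1];
		i--;
	}
	t->ext[i].pgoff = pgoff;
	t->ext[i].count = count;
	t->n++;
	return 0;
}

/*
 * Kernel-chosen offset: first fit. Walk the sorted table and take the
 * lowest gap of at least count pages, or the space after the last extent.
 */
static int extent_alloc(struct extent_table *t,
			uint64_t count, uint64_t *pgoff_out)
{
	uint64_t candidate = 0;

	for (size_t i = 0; i < t->n; i++) {
		if (t->ext[i].pgoff - candidate >= count)
			break;
		candidate = t->ext[i].pgoff + t->ext[i].count;
	}
	if (extent_insert(t, candidate, count) < 0)
		return -1;
	*pgoff_out = candidate;
	return 0;
}
```

Either way the kernel ends up holding the table and doing the overlap
check; the difference is only whether userspace also has to keep its own
copy in order to choose offsets.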

Another use case of gntalloc that may prove useful is to have more than
one application able to map the same grant within the kernel. For gntdev,
this can be done by mapping the pages multiple times (although that may
not be the most efficient). Moving the gntalloc lists out of the file
private data would allow any user of gntalloc to map any shared page. The
primary reason not to do this is that it prevents automatic cleanup of
allocated pages on close (which is important when the userspace app doing
the mapping crashes).

>> Userspace does still have to know about grefs, of course, but only to pass
>> to the domain doing the mapping, not for its own mmap().
> 
> Ah, yes, of course.
> 
>>>> +
>>>> +	/* This flag prevents Bad PTE errors when the memory is unmapped. */
>>>> +	vma->vm_flags |= VM_RESERVED;
>>>> +	vma->vm_flags |= VM_DONTCOPY;
>>>> +	vma->vm_flags |= VM_IO;
>>> If you set VM_PFNMAP then you don't need to deal with faults.
>> Useful to know. Is that more efficient/preferred to defining a
>> .fault handler? I used this method because that's what is used in
>> kernel/relay.c.
> 
> Well, as you currently have it, your mmap() function doesn't map
> anything, so you're relying on demand faulting to populate the ptes. 
> Since you've already allocated the pages that's just a soft fault, but
> it means you end up with a lot of per-page entries into the hypervisor.
> 
> If you make mmap pre-populate all the ptes (a nice big fat batch
> operation), then you should never get faults on the vma.  You can set
> PFNMAP to make sure of that (since you're already setting all the
> "woowoo vma" flags, that makes sense).
> 
> Its actual meaning is "this vma contains pages which are not really
> kernel memory, so paging them doesn't make sense" - ie, device or
> foreign memory (we use it in gntdev).
> 
> In this case, the pages are normal kernel pages but they're being given
> over to a "device" and so are no longer subject to normal kernel
> lifetime rules.  So I think PFNMAP makes sense in this case too.
> 
> 
>     J
> 

Agreed; once mapped, the frame numbers (GFN & MFN) won't change until
they are unmapped, so pre-populating them will be better.
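
As a toy illustration of the difference (a userspace model only, not kernel
code; toy_vma, toy_update_ptes and toy_touch are made-up names, and the
"updates" counter merely stands in for separate page-table update
operations), demand faulting costs one update per page touched, while
pre-populating in mmap costs one batched update for the whole vma:

```c
#include <stddef.h>

#define NPAGES    8
#define PTE_EMPTY 0UL

/* Toy stand-in for a vma: one "pte" per page, plus an update counter. */
struct toy_vma {
	unsigned long pte[NPAGES];  /* PTE_EMPTY means not present */
	unsigned long updates;      /* number of separate update calls */
};

/*
 * Install n entries starting at index first in a single call.
 * frames[] holds the (nonzero) frame number for every page of the vma.
 */
static void toy_update_ptes(struct toy_vma *v, size_t first, size_t n,
			    const unsigned long *frames)
{
	for (size_t i = 0; i < n; i++)
		v->pte[first + i] = frames[first + i];
	v->updates++;
}

/*
 * Access one page. If its entry is missing, model a soft fault that
 * installs just that entry, costing one update call.
 */
static unsigned long toy_touch(struct toy_vma *v, size_t idx,
			       const unsigned long *frames)
{
	if (v->pte[idx] == PTE_EMPTY)
		toy_update_ptes(v, idx, 1, frames);
	return v->pte[idx];
}
```

Touching all NPAGES pages of an empty toy_vma leaves updates at NPAGES;
calling toy_update_ptes(v, 0, NPAGES, frames) first leaves it at 1, and
later touches add nothing.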


-- 
Daniel De Graaf
National Security Agency
