From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Cc: xen-devel@lists.xensource.com, Ian.Campbell@citrix.com
Subject: Re: [PATCH 5/6] xen-gntalloc: Userspace grant allocation driver
Date: Tue, 14 Dec 2010 14:40:38 -0800
Message-ID: <4D07F266.9000008@goop.org>
In-Reply-To: <4D07EA5C.8050605@tycho.nsa.gov>

On 12/14/2010 02:06 PM, Daniel De Graaf wrote:
>>> +static int gntalloc_mmap(struct file *filp, struct vm_area_struct *vma)
>>> +{
>>> +	struct gntalloc_file_private_data *priv = filp->private_data;
>>> +	struct gntalloc_gref *gref;
>>> +
>>> +	if (debug)
>>> +		printk("%s: priv %p, page %lu\n", __func__,
>>> +		       priv, vma->vm_pgoff);
>>> +
>>> +	/*
>>> +	 * There is a 1-to-1 correspondence of grant references to shared
>>> +	 * pages, so it only makes sense to map exactly one page per
>>> +	 * call to mmap().
>>> +	 */
>> Single-page mmap makes sense if the only possible use-cases are for
>> single-page mappings, but if you're talking about framebuffers and the
>> like it seems like a very awkward way to use mmap.  It would be cleaner
>> from an API perspective to have a user-mode defined flat address space
>> indexed by pgoff which maps to an array of grefs, so you can sensibly do
>> a multi-page mapping.
>>
>> It would also allow you to hide the grefs from usermode entirely.  Then
>> it's just up to usermode to choose suitable file offsets for itself.
> I considered this, but wanted to keep userspace compatibility with the
> previously created interface.

Is that private to you, or something in broader use?

>  If there's no reason to avoid doing so, I'll
> change the ioctl interface to allocate an array of grants and calculate the
> offset similar to how gntdev does currently (picks a suitable open slot).

I guess there are three options: you could get the kernel to allocate
extents, make usermode do it, or have one fd per extent and always start
from offset 0.  I guess the last could get very messy if you want to
have lots of mappings...  Making usermode define the offsets seems
simplest and most flexible, because then they can stitch together the
file-offset space in any way that's convenient to them (you just need to
deal with overlaps in that space).
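
As a rough illustration of the usermode-defined offset space (not code
from the patch), here is a minimal userspace sketch. The names
struct extent, struct offset_space, space_add() and space_find() are all
hypothetical, and a fixed-size table stands in for whatever structure
the driver would really use. It rejects overlapping extents and resolves
a multi-page range to the extent that backs it. It assumes
pgoff + count does not wrap around.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent: a run of pages in the file-offset space, backed
 * by an array of grant references (one gref per page). */
struct extent {
	unsigned long pgoff;    /* first page offset, chosen by usermode */
	unsigned long count;    /* number of pages (and grefs) */
	const uint32_t *grefs;  /* grefs[i] backs page pgoff + i */
};

#define MAX_EXTENTS 16          /* arbitrary limit for the sketch */

struct offset_space {
	struct extent ext[MAX_EXTENTS];
	size_t n;
};

/* Do the half-open ranges [a, a+an) and [b, b+bn) intersect?
 * Assumes neither sum overflows. */
static int ranges_overlap(unsigned long a, unsigned long an,
			  unsigned long b, unsigned long bn)
{
	return a < b + bn && b < a + an;
}

/* Register an extent at a usermode-chosen offset.
 * Returns 0 on success, -1 if it is empty, overlaps, or the table is full. */
static int space_add(struct offset_space *s, unsigned long pgoff,
		     unsigned long count, const uint32_t *grefs)
{
	size_t i;

	if (count == 0 || s->n == MAX_EXTENTS)
		return -1;
	for (i = 0; i < s->n; i++)
		if (ranges_overlap(pgoff, count,
				   s->ext[i].pgoff, s->ext[i].count))
			return -1;
	s->ext[s->n++] = (struct extent){ pgoff, count, grefs };
	return 0;
}

/* Find the extent that fully contains [pgoff, pgoff + npages), which is
 * what a multi-page mmap() at that offset would need. */
static const struct extent *space_find(const struct offset_space *s,
				       unsigned long pgoff,
				       unsigned long npages)
{
	size_t i;

	for (i = 0; i < s->n; i++) {
		const struct extent *e = &s->ext[i];

		if (pgoff >= e->pgoff && npages <= e->count &&
		    pgoff - e->pgoff <= e->count - npages)
			return e;
	}
	return NULL;
}
```

A mapping that straddles two extents is simply refused here; whether the
real driver should allow that is a separate design question.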

> Userspace does still have to know about grefs, of course, but only to pass
> to the domain doing the mapping, not for its own mmap().

Ah, yes, of course.

>>> +	if (((vma->vm_end - vma->vm_start) >> PAGE_SHIFT) != 1) {
>>> +		printk(KERN_ERR "%s: Only one page can be memory-mapped "
>>> +			"per grant reference.\n", __func__);
>>> +		return -EINVAL;
>>> +	}
>>> +
>>> +	if (!(vma->vm_flags & VM_SHARED)) {
>>> +		printk(KERN_ERR "%s: Mapping must be shared.\n",
>>> +			__func__);
>>> +		return -EINVAL;
>>> +	}
>>> +
>>> +	spin_lock(&gref_lock);
>>> +	gref = find_gref(priv, vma->vm_pgoff);
>>> +	if (gref == NULL) {
>>> +		spin_unlock(&gref_lock);
>>> +		printk(KERN_ERR "%s: Could not find a grant reference with "
>>> +			"page index %lu.\n", __func__, vma->vm_pgoff);
>>> +		return -ENOENT;
>>> +	}
>>> +	gref->users++;
>>> +	spin_unlock(&gref_lock);
>>> +
>>> +	vma->vm_private_data = gref;
>>> +
>>> +	/* This flag prevents Bad PTE errors when the memory is unmapped. */
>>> +	vma->vm_flags |= VM_RESERVED;
>>> +	vma->vm_flags |= VM_DONTCOPY;
>>> +	vma->vm_flags |= VM_IO;
>> If you set VM_PFNMAP then you don't need to deal with faults.
> Useful to know. Is that more efficient/preferred to defining a
> .fault handler? I used this method because that's what is used in
> kernel/relay.c.

Well, as you currently have it, your mmap() function doesn't map
anything, so you're relying on demand faulting to populate the ptes. 
Since you've already allocated the pages that's just a soft fault, but
it means you end up with a lot of per-page entries into the hypervisor.

If you make mmap pre-populate all the ptes (a nice big fat batch
operation), then you should never get faults on the vma.  You can set
PFNMAP to make sure of that (since you're already setting all the
"woowoo vma" flags, that makes sense).

The actual meaning of VM_PFNMAP is "this vma contains pages which are not
really kernel memory, so paging them doesn't make sense" - i.e., device or
foreign memory (we use it in gntdev).

In this case, the pages are normal kernel pages but they're being given
over to a "device" and so are no longer subject to normal kernel
lifetime rules.  So I think PFNMAP makes sense in this case too.
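
To make the cost argument about faulting versus pre-populating concrete,
here is a toy userspace model; it is not kernel code and calls no real
Xen or mm API. struct model, set_ptes(), map_by_faulting() and
map_by_prepopulating() are hypothetical names, and the model assumes
that a whole batch of pte updates fits into a single hypervisor entry.

```c
#include <stddef.h>

/* Toy model: counts pretend entries into the hypervisor. */
struct model {
	unsigned long hypercalls;  /* number of pretend hypervisor entries */
	unsigned long ptes_set;    /* total ptes populated */
};

/* One pretend hypervisor entry that installs n ptes as a batch. */
static void set_ptes(struct model *m, unsigned long n)
{
	m->hypercalls++;
	m->ptes_set += n;
}

/* Demand faulting: the first touch of each page installs one pte,
 * so every page costs its own hypervisor entry. */
static void map_by_faulting(struct model *m, unsigned long npages)
{
	unsigned long i;

	for (i = 0; i < npages; i++)
		set_ptes(m, 1);
}

/* Pre-population at mmap() time: one batch covers the whole vma. */
static void map_by_prepopulating(struct model *m, unsigned long npages)
{
	set_ptes(m, npages);
}
```

Both paths end up with the same ptes populated; only the number of
entries into the hypervisor differs.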


    J
