From: David Vrabel <david.vrabel@citrix.com>
To: David Vrabel <david.vrabel@citrix.com>,
"Xen-devel@lists.xen.org" <Xen-devel@lists.xen.org>
Cc: Jennifer Herbert <jennifer.herbert@citrix.com>
Subject: Re: Linux grant map/unmap improvement proposal (Draft B)
Date: Thu, 18 Dec 2014 17:55:51 +0000
Message-ID: <54931527.3070204@citrix.com>
In-Reply-To: <543BD686.3080006@citrix.com>
On 13/10/14 14:41, David Vrabel wrote:
>
> Design
> ======
Jennifer has put together most of the initial implementation of this so
expect a full series some time next year.
It didn't quite end up as described here.
> Userspace address to page translation
> -------------------------------------
>
> The m2p_override table shall be removed.
>
> Each VMA (struct vm_struct) shall contain an additional pointer to an
> optional array of pages. This array shall be sized to cover the full
> extent of the VMA.
>
> The gntdev driver populates this array with the relevant pages for the
> foreign mappings as they are mapped. It shall also clear them when
> unmapping. The gntdev driver must ensure it properly splits the page
> array when the VMA itself is split.
>
> Since the m2p lookup will not return a local PFN, the native
> get_user_pages_fast() call will fail. Prior to attempting to fault in
> the pages, get_user_pages() can simply look up the pages in the VMA's
> page array.
This was not true. Instead, we mark the userspace PTEs as special
(_PAGE_SPECIAL set), which causes the generic x86 code to skip the fast path.
We also changed vm_normal_page() to look in vma->pages. This puts the
extra code outside of any common use case (i.e., away from any handling
of non-special mappings), further reducing the impact on existing use cases.
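As an illustration only (this is not part of the patch), the lookup described above can be modelled in userspace C. The names sketch_page, sketch_vma, sketch_lookup and SKETCH_PAGE_SHIFT are hypothetical stand-ins for the kernel's struct page, struct vm_area_struct and PAGE_SHIFT, and 4 KiB pages are assumed.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12UL	/* assumption: 4 KiB pages, as on x86 */

/* Hypothetical stand-in for struct page. */
struct sketch_page {
	unsigned long pfn;
};

/* Hypothetical stand-in for struct vm_area_struct. */
struct sketch_vma {
	unsigned long vm_start;		/* first address covered */
	unsigned long vm_end;		/* first address past the end */
	struct sketch_page **pages;	/* optional; NULL for ordinary VMAs */
};

/*
 * Models the branch taken for a special PTE: if the VMA carries a page
 * array, return the entry for the page that contains addr. Returns NULL
 * when the VMA has no page array.
 */
static struct sketch_page *sketch_lookup(const struct sketch_vma *vma,
					 unsigned long addr)
{
	if (!vma->pages)
		return NULL;
	assert(addr >= vma->vm_start && addr < vma->vm_end);
	return vma->pages[(addr - vma->vm_start) >> SKETCH_PAGE_SHIFT];
}
```

The array index is simply the page offset of addr within the VMA, so the array needs one entry per page between vm_start and vm_end.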
For the curious, the 3-liner mm/memory.c change is below. It does not
handle VMA splitting yet, but that should be straightforward.
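One possible way to handle a split, sketched in userspace C under stated assumptions (it is not taken from the series): when a VMA is split at a page-aligned address, the upper half's page array can start at the matching offset into the original array. All names here (sketch_page, sketch_vma, sketch_split, SKETCH_PAGE_SHIFT) are hypothetical, and 4 KiB pages are assumed.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12UL	/* assumption: 4 KiB pages, as on x86 */

/* Hypothetical stand-in for struct page. */
struct sketch_page {
	unsigned long pfn;
};

/* Hypothetical stand-in for struct vm_area_struct. */
struct sketch_vma {
	unsigned long vm_start;		/* first address covered */
	unsigned long vm_end;		/* first address past the end */
	struct sketch_page **pages;	/* optional; NULL for ordinary VMAs */
};

/*
 * Split vma at addr. Afterwards vma covers [vm_start, addr) and upper
 * covers [addr, old vm_end). The upper half's page array is a view into
 * the same underlying array, offset by the number of pages left in the
 * lower half, so index 0 of each array still matches its own vm_start.
 */
static void sketch_split(struct sketch_vma *vma, struct sketch_vma *upper,
			 unsigned long addr)
{
	assert(addr > vma->vm_start && addr < vma->vm_end);
	assert((addr & ((1UL << SKETCH_PAGE_SHIFT) - 1)) == 0);

	upper->vm_start = addr;
	upper->vm_end = vma->vm_end;
	upper->pages = vma->pages
		? vma->pages + ((addr - vma->vm_start) >> SKETCH_PAGE_SHIFT)
		: NULL;
	vma->vm_end = addr;
}
```

Sharing one allocation between both halves keeps the split cheap, but a real implementation would also have to decide which VMA owns and frees the array; the sketch leaves that open.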
> Userspace grant performance
> ---------------------------
>
> - Lazily map grants into userspace on faults. For applications that
> do not access the foreign frames by the userspace mappings (such as
> block backends using direct I/O) this would avoid a set of maps and
> unmaps. This lazy mode would have to be requested by the userspace
> program (since faulting many pages would be much more expensive than
> a single batched map).
This does not look possible without more invasive changes to core MM
code. Although we can lazily fault in the mappings we still need PTEs
to allow get_user_pages() to work, so map-on-fault isn't useful.
David
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -289,6 +289,7 @@ struct vm_area_struct {
 #ifdef CONFIG_NUMA
 	struct mempolicy *vm_policy;	/* NUMA policy for the VMA */
 #endif
+	struct page **pages;
 };
 
 struct core_thread {
diff --git a/mm/memory.c b/mm/memory.c
index 4b60011..3ca13bb 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -774,6 +774,8 @@ struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
 	if (HAVE_PTE_SPECIAL) {
 		if (likely(!pte_special(pte)))
 			goto check_pfn;
+		if (vma->pages)
+			return vma->pages[(addr - vma->vm_start) >> PAGE_SHIFT];
 		if (vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP))
 			return NULL;
 		if (!is_zero_pfn(pfn))
--
1.7.10.4