From: David Vrabel <david.vrabel@citrix.com>
To: Andrew Morton <akpm@linux-foundation.org>, linux-kernel@vger.kernel.org
Cc: David Vrabel <david.vrabel@citrix.com>,
	linux-mm@kvack.org, xen-devel@lists.xenproject.org,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>
Subject: [PATCH 1/2] mm: allow for an alternate set of pages for userspace mappings
Date: Thu, 8 Jan 2015 15:28:43 +0000
Message-ID: <1420730924-22811-2-git-send-email-david.vrabel@citrix.com>
In-Reply-To: <1420730924-22811-1-git-send-email-david.vrabel@citrix.com>

Add an optional array of pages to struct vm_area_struct that can be
used to find the page backing a VMA.  This is useful in cases where the
normal mechanisms for finding the page don't work.  This array is only
inspected if the PTE is special.
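
As a rough illustration of the lookup described above, here is a
user-space sketch.  The names struct vma_model and lookup_override()
are hypothetical stand-ins and not kernel API, and PAGE_SHIFT is
assumed to be 12 (4 KiB pages).  Returning NULL here means "no
override, use the normal lookup path".

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-ins for the kernel types; only the fields used here. */
struct page { int id; };

struct vma_model {
	unsigned long vm_start;	/* first address covered by the mapping */
	unsigned long vm_end;	/* first address past the mapping */
	struct page **pages;	/* optional override array, may be NULL */
};

/*
 * Return the override page for addr.  Return NULL when the PTE is not
 * special or the VMA has no override array, meaning the caller should
 * fall back to its normal lookup.
 */
static struct page *lookup_override(const struct vma_model *vma,
				    unsigned long addr, int pte_is_special)
{
	assert(addr >= vma->vm_start && addr < vma->vm_end);

	if (!pte_is_special || !vma->pages)
		return NULL;
	/* One array slot per page, indexed from the start of the VMA. */
	return vma->pages[(addr - vma->vm_start) >> PAGE_SHIFT];
}
```

The array is indexed relative to vma->vm_start, which is why a split
only needs to move the pointer rather than copy the array.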

Splitting a VMA with such an array of pages is trivially done by
adjusting vma->pages.  The original creator of the VMA must free the
page array only after all sub-VMAs are closed (e.g., by ref-counting in
vm_ops->open and vm_ops->close).
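
One possible shape for that ref-counting, as a user-space sketch only.
struct page_array, struct vma_model, split_model() and close_model()
are hypothetical names for illustration; a real driver would hang the
owner object off vm_private_data and do the counting in its own
vm_ops->open and vm_ops->close handlers.  PAGE_SHIFT is assumed to be
12.

```c
#include <assert.h>
#include <stdlib.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

struct page;		/* opaque in this sketch */

/* Hypothetical owner object shared by all sub-VMAs of one mapping. */
struct page_array {
	struct page **base;	/* start of the allocation; the pointer to free */
	int refs;		/* number of open VMAs using the array */
};

struct vma_model {
	unsigned long vm_start, vm_end;
	struct page **pages;		/* points somewhere into owner->base */
	struct page_array *owner;	/* plays the role of vm_private_data */
};

/* Split vma at addr; *new receives the other half, mirroring __split_vma(). */
static void split_model(struct vma_model *vma, struct vma_model *new,
			unsigned long addr, int new_below)
{
	unsigned long delta = (addr - vma->vm_start) >> PAGE_SHIFT;

	assert(addr > vma->vm_start && addr < vma->vm_end);
	*new = *vma;
	if (new_below) {
		new->vm_end = addr;
		vma->vm_start = addr;	/* done here to keep the model simple */
		if (vma->pages)
			vma->pages += delta;
	} else {
		new->vm_start = addr;
		vma->vm_end = addr;
		if (new->pages)
			new->pages += delta;
	}
	new->owner->refs++;		/* what an open handler would do */
}

/* What a close handler would do; returns 1 if the array was freed. */
static int close_model(struct vma_model *vma)
{
	struct page_array *owner = vma->owner;

	if (--owner->refs > 0)
		return 0;
	free(owner->base);		/* free the base, never vma->pages */
	owner->base = NULL;
	return 1;
}
```

The owner keeps the original base pointer because vma->pages of a
sub-VMA may point into the middle of the allocation after a split.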

One use case is a Xen PV guest mapping foreign pages into userspace.

In a Xen PV guest, the PTEs contain MFNs so get_user_pages() (for
example) must do an MFN to PFN (M2P) lookup before it can get the
page.  For foreign pages (those owned by another guest) the M2P lookup
returns the PFN as seen by the foreign guest (which would be
completely the wrong page for the local guest).

This cannot be fixed by improving the M2P lookup since one MFN may be
mapped onto two or more pages so getting the right page is impossible
given just the MFN.
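
A toy model of that ambiguity, not based on any real Xen data
structure: struct local_page and pages_mapping_mfn() are hypothetical
names.  When the count for an MFN is greater than one, no lookup keyed
on the MFN alone can say which local page was meant.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record: a local page and the machine frame it maps. */
struct local_page {
	unsigned long pfn;	/* frame number as seen by the local guest */
	unsigned long mfn;	/* machine frame currently mapped there */
};

/* Count how many local pages map the given MFN. */
static size_t pages_mapping_mfn(const struct local_page *pages, size_t n,
				unsigned long mfn)
{
	size_t i, count = 0;

	assert(pages != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (pages[i].mfn == mfn)
			count++;
	return count;
}
```

The per-VMA page array avoids the reverse lookup altogether by keying
on the virtual address instead of the MFN.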

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 include/linux/mm_types.h |    8 ++++++++
 mm/memory.c              |    2 ++
 mm/mmap.c                |   12 +++++++++++-
 3 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 6d34aa2..4f34609 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -309,6 +309,14 @@ struct vm_area_struct {
 #ifdef CONFIG_NUMA
 	struct mempolicy *vm_policy;	/* NUMA policy for the VMA */
 #endif
+	/*
+	 * Array of pages to override the default vm_normal_page()
+	 * result iff the PTE is special.
+	 *
+	 * The memory for this should be refcounted in vm_ops->open
+	 * and vm_ops->close.
+	 */
+	struct page **pages;
 };
 
 struct core_thread {
diff --git a/mm/memory.c b/mm/memory.c
index ca920d1..98520f6 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -754,6 +754,8 @@ struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
 	if (HAVE_PTE_SPECIAL) {
 		if (likely(!pte_special(pte)))
 			goto check_pfn;
+		if (vma->pages)
+			return vma->pages[(addr - vma->vm_start) >> PAGE_SHIFT];
 		if (vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP))
 			return NULL;
 		if (!is_zero_pfn(pfn))
diff --git a/mm/mmap.c b/mm/mmap.c
index 7b36aa7..504dc5c 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2448,6 +2448,7 @@ static int __split_vma(struct mm_struct *mm, struct vm_area_struct *vma,
 	      unsigned long addr, int new_below)
 {
 	struct vm_area_struct *new;
+	unsigned long delta;
 	int err = -ENOMEM;
 
 	if (is_vm_hugetlb_page(vma) && (addr &
@@ -2463,11 +2464,20 @@ static int __split_vma(struct mm_struct *mm, struct vm_area_struct *vma,
 
 	INIT_LIST_HEAD(&new->anon_vma_chain);
 
+	delta = (addr - vma->vm_start) >> PAGE_SHIFT;
+
 	if (new_below)
 		new->vm_end = addr;
 	else {
 		new->vm_start = addr;
-		new->vm_pgoff += ((addr - vma->vm_start) >> PAGE_SHIFT);
+		new->vm_pgoff += delta;
+	}
+
+	if (vma->pages) {
+		if (new_below)
+			vma->pages += delta;
+		else
+			new->pages += delta;
 	}
 
 	err = vma_dup_policy(vma, new);
-- 
1.7.10.4

