From: "Pallipadi, Venkatesh" <venkatesh.pallipadi@intel.com>
To: "Thomas Hellström" <thellstrom@vmware.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
	"Pallipadi, Venkatesh" <venkatesh.pallipadi@intel.com>,
	Linux kernel mailing list <linux-kernel@vger.kernel.org>,
	"Siddha, Suresh B" <suresh.b.siddha@intel.com>,
	Nick Piggin <npiggin@suse.de>
Subject: Re: 2.6.29 pat issue
Date: Tue, 3 Mar 2009 22:08:58 -0800
Message-ID: <20090304060857.GA18318@linux-os.sc.intel.com>
In-Reply-To: <498C062C.201@vmware.com>

On Fri, Feb 06, 2009 at 01:43:08AM -0800, Thomas Hellström wrote:
> Eric W. Biederman wrote:
> > Thomas Hellstrom <thellstrom@vmware.com> writes:
> >
> >
> >
> >> Indeed, it's crucial to keep the mappings consistent, but failure to do so is a
> >> kernel driver bug, it should never be the result of invalid user data.
> >>
> >
> > It easily can be.  Think of an X server mmaping frame buffers. Or other
> > device bars.
> >
> >
> Hmm, Yes  you're right, although I'm still a bit doubtful about RAM pages.
> 
> Wait. Now I see what's causing the problems. The code is  assuming that
> VM_PFNMAP vmas never map RAM pages. That's also an invalid assumption.
> See comments in mm/memory.c
> 
> So probably the attribute check should be done for the insert_pfn path
> of VM_MIXEDMAP as well. That's not done today.
> 
> So there are three distinct bugs at this point:
> 
> 1) VMAs with VM_PFNMAP are incorrectly assumed to be linear if
> vma->vm_pgoff non-null.
> 2) VM_PFNMAP VMA PTEs are incorrectly assumed to never point to physical
> RAM.
> 3) There is no check for the insert_pfn  path of vm_insert_mixed().
> 

The patch below addresses (1) above.

About (2): yes, we can optimize the PAT code if we use struct page to track
PFNMAP, as long as the memory is backed by a struct page. It has some
complications with refcounting the number of mappings and related things. We
are actively looking at it. About (3): vm_insert_mixed was not used by any
in-kernel driver, so we did not add checks there; the intention was to fix the
most commonly used remap_pfn_range and vm_insert_pfn first.

The patch below should fix the regression upstream. I don't like the way we
overload a bit here, but we don't see any other option.
Nick: do you see any cleaner way to do this?
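
To make the difference between the old and new checks concrete, here is a
minimal userspace sketch, not kernel code. struct vma_model and the two
model_* helpers are hypothetical stand-ins invented for illustration, and the
flag bit values are assumed for the sketch rather than taken as authoritative.
Only the two predicates mirror the before and after bodies of
is_linear_pfn_mapping() in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bit values assumed for this sketch; the real ones are in include/linux/mm.h. */
#define VM_PFNMAP         0x00000400UL
#define VM_NONLINEAR      0x00800000UL
#define VM_PFNMAP_AT_MMAP (VM_NONLINEAR | VM_PFNMAP)

/* Hypothetical stand-in for the two vm_area_struct fields the checks read. */
struct vma_model {
	unsigned long vm_flags;
	unsigned long vm_pgoff;
};

/* Old check: any PFNMAP vma with a non-zero vm_pgoff is treated as linear. */
static bool old_is_linear_pfn_mapping(const struct vma_model *vma)
{
	return (vma->vm_flags & VM_PFNMAP) && vma->vm_pgoff;
}

/* New check: both VM_NONLINEAR and VM_PFNMAP must be set. */
static bool new_is_linear_pfn_mapping(const struct vma_model *vma)
{
	return (vma->vm_flags & VM_PFNMAP_AT_MMAP) == VM_PFNMAP_AT_MMAP;
}

/* Models the flag update the patch adds to remap_pfn_range(). */
static void model_full_remap(struct vma_model *vma)
{
	assert(vma != NULL);
	vma->vm_flags |= VM_PFNMAP_AT_MMAP;
}

/*
 * Models a driver that sets VM_PFNMAP itself, has a non-zero vm_pgoff,
 * and populates its PTEs at fault time instead of at mmap time.
 */
static void model_fault_time_pfnmap(struct vma_model *vma, unsigned long pgoff)
{
	assert(vma != NULL);
	vma->vm_flags |= VM_PFNMAP;
	vma->vm_pgoff = pgoff;
}
```

With these definitions, a vma set up through model_fault_time_pfnmap() with a
non-zero offset is reported as linear by the old check but not by the new one,
while a vma that went through model_full_remap() is reported as linear by the
new check regardless of vm_pgoff.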

Thanks,
Venki

Subject: [PATCH] VM, x86 PAT: Change implementation of is_linear_pfn_mapping

Use of vma->vm_pgoff to identify the pfnmaps that are fully mapped at
mmap time is broken, as vm_pgoff can also be set when the full mapping is
not set up at mmap time.
http://marc.info/?l=linux-kernel&m=123383810628583&w=2

Change the logic to overload the VM_NONLINEAR flag along with VM_PFNMAP to
mean that the full mapping was set up at mmap time. This distinction is needed
by the x86 PAT code.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
---
 include/linux/mm.h |    8 +++++++-
 mm/memory.c        |    2 ++
 2 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 065cdf8..6c3fc3a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -127,6 +127,12 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_SPECIAL (VM_IO | VM_DONTEXPAND | VM_RESERVED | VM_PFNMAP)
 
 /*
+ * pfnmap vmas that are fully mapped at mmap time (not mapped on fault).
+ * Used by x86 PAT to identify such PFNMAP mappings and optimize their handling.
+ */
+#define VM_PFNMAP_AT_MMAP (VM_NONLINEAR | VM_PFNMAP)
+
+/*
  * mapping from the currently active vm_flags protection bits (the
  * low four bits) to a page protection mask..
  */
@@ -145,7 +151,7 @@ extern pgprot_t protection_map[16];
  */
 static inline int is_linear_pfn_mapping(struct vm_area_struct *vma)
 {
-	return ((vma->vm_flags & VM_PFNMAP) && vma->vm_pgoff);
+	return ((vma->vm_flags & VM_PFNMAP_AT_MMAP) == VM_PFNMAP_AT_MMAP);
 }
 
 static inline int is_pfn_mapping(struct vm_area_struct *vma)
diff --git a/mm/memory.c b/mm/memory.c
index baa999e..457e97e 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1671,6 +1671,7 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
 		return -EINVAL;
 
 	vma->vm_flags |= VM_IO | VM_RESERVED | VM_PFNMAP;
+	vma->vm_flags |= VM_PFNMAP_AT_MMAP;
 
 	err = track_pfn_vma_new(vma, &prot, pfn, PAGE_ALIGN(size));
 	if (err) {
@@ -1679,6 +1680,7 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
 		 * needed from higher level routine calling unmap_vmas
 		 */
 		vma->vm_flags &= ~(VM_IO | VM_RESERVED | VM_PFNMAP);
+		vma->vm_flags &= ~VM_PFNMAP_AT_MMAP;
 		return -EINVAL;
 	}
 
-- 
1.6.0.6


Thread overview: 33+ messages
2009-02-05 12:47 2.6.29 pat issue Thomas Hellström
2009-02-05 18:03 ` Pallipadi, Venkatesh
2009-02-05 21:32   ` Thomas Hellstrom
2009-02-05 23:08     ` Pallipadi, Venkatesh
2009-02-06  9:51       ` Thomas Hellström
2009-02-06  1:11     ` Eric W. Biederman
2009-02-06  9:43       ` Thomas Hellström
2009-03-04  6:08         ` Pallipadi, Venkatesh [this message]
2009-03-04  9:56           ` Thomas Hellstrom
2009-03-06 22:38             ` Pallipadi, Venkatesh
2009-03-06 23:44               ` Thomas Hellstrom
2009-03-10  1:39                 ` Pallipadi, Venkatesh
2009-03-10  8:22                   ` Thomas Hellstrom
2009-03-10 17:42                     ` Pallipadi, Venkatesh
2009-03-11  9:17                       ` Thomas Hellstrom
2009-03-11  9:33                         ` Ingo Molnar
2009-03-11 17:54                           ` [PATCH] VM, x86, PAT: Change implementation of is_linear_pfn_mapping Pallipadi, Venkatesh
2009-03-11 22:09                             ` Frans Pop
2009-03-12  0:31                               ` Pallipadi, Venkatesh
2009-03-12  3:22                                 ` Pallipadi, Venkatesh
2009-03-12  5:45                                 ` Frans Pop
2009-03-12 18:59                                   ` Pallipadi, Venkatesh
2009-03-12 20:30                                     ` Frans Pop
2009-03-12 22:48                                       ` Pallipadi, Venkatesh
2009-03-13  0:36                                         ` Ingo Molnar
2009-03-13  0:45                                           ` [PATCH] VM, x86, PAT: Change is_linear_pfn_mapping to not use vm_pgoff Pallipadi, Venkatesh
2009-03-13  4:03                                             ` [tip:x86/urgent] " Pallipadi, Venkatesh
2009-03-13 16:25                                               ` Nick Piggin
2009-03-13 17:00                                                 ` Pallipadi, Venkatesh
2009-03-14  2:52                                                   ` Nick Piggin
2009-03-13 23:35                                                 ` [PATCH] Add a new vm flag to track full pfnmap at mmap Pallipadi, Venkatesh
2009-03-14  2:53                                                   ` Nick Piggin
2009-03-14  8:54                                                   ` [tip:x86/urgent] VM, x86, PAT: add " Pallipadi, Venkatesh
