All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@elte.hu>
To: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Frans Pop <elendil@planet.nl>,
	Jesse Barnes <jesse.barnes@intel.com>,
	linux-kernel@vger.kernel.org, "Packard,
	Keith" <keith.packard@intel.com>,
	Yinghai Lu <yhlu.kernel@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Hugh Dickins <hugh@veritas.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [git head] X86_PAT & mprotect
Date: Wed, 7 May 2008 09:02:17 +0200	[thread overview]
Message-ID: <20080507070217.GD32195@elte.hu> (raw)
In-Reply-To: <20080506224240.GA18706@linux-os.sc.intel.com>


* Venki Pallipadi <venkatesh.pallipadi@intel.com> wrote:

> There is a hole in mprotect, which lets the user to change the page 
> cache type bits by-passing the kernel reserve_memtype and free_memtype 
> wrappers. Fix the hole by not letting mprotect change the PAT bits.
> 
> Some versions of X used the mprotect hole to change caching type from 
> UC to WB, so that it can then use mtrr to program WC for that region 
> [1]. Change the mmap of pci space through /sys or /proc interfaces 
> from UC to UC_MINUS. With this change, X will not need to use mprotect 
> hole to get WC type.
> 
> [1] lkml.org/lkml/2008/4/16/369
> 
> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
> 
> ---
>  arch/x86/pci/i386.c       |    4 +---
>  include/asm-x86/pgtable.h |    5 ++++-
>  include/linux/mm.h        |    1 +
>  mm/mmap.c                 |   13 +++++++++++++
>  mm/mprotect.c             |    4 +++-
>  5 files changed, 22 insertions(+), 5 deletions(-)

hm, that's one dangerous looking patch. (Cc:-ed more MM folks. I've 
attached the patch below for reference.)

the purpose of the fix itself seems to make some sense - we dont want 
mprotect() change the PAT bits in the pte from what they got populated 
with at fault or mmap time.

the pte_modify() change looks correct at first sight.

The _PAGE_PROT_PRESERVE_BITS solution looks a bit ugly (although we do 
have a couple of other similar #ifndefs in the MM already).

at minimum we should add vm_get_page_prot_preserve() as an inline 
function to mm.h if _PAGE_PROT_PRESERVE_BITS is not defined, and make it 
call vm_get_page_prot(). Also, vm_get_page_prot() in mm/mmap.c should 
probably be marked inline so that we'll have only a single function call 
[vm_get_page_prot() is trivial].

but i'm wondering why similar issues never came up on other 
architectures - i thought it would be rather common to have immutable 
pte details. So maybe i'm missing something here ...

	Ingo

------------------------------------->
Subject: generic, x86, PAT: fix mprotect
From: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Date: Tue, 6 May 2008 15:42:40 -0700

There is a hole in mprotect, which lets the user to change the page
cache type bits by-passing the kernel reserve_memtype and free_memtype
wrappers. Fix the hole by not letting mprotect change the PAT bits.

Some versions of X used the mprotect hole to change caching type from UC to WB,
so that it can then use mtrr to program WC for that region [1]. Change the
mmap of pci space through /sys or /proc interfaces from UC to UC_MINUS.
With this change, X will not need to use mprotect hole to get WC type.

[1] lkml.org/lkml/2008/4/16/369

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 arch/x86/pci/i386.c       |    4 +---
 include/asm-x86/pgtable.h |    5 ++++-
 include/linux/mm.h        |    1 +
 mm/mmap.c                 |   13 +++++++++++++
 mm/mprotect.c             |    4 +++-
 5 files changed, 22 insertions(+), 5 deletions(-)

Index: linux-x86.q/arch/x86/pci/i386.c
===================================================================
--- linux-x86.q.orig/arch/x86/pci/i386.c
+++ linux-x86.q/arch/x86/pci/i386.c
@@ -301,15 +301,13 @@ int pci_mmap_page_range(struct pci_dev *
 	prot = pgprot_val(vma->vm_page_prot);
 	if (pat_wc_enabled && write_combine)
 		prot |= _PAGE_CACHE_WC;
-	else if (pat_wc_enabled)
+	else if (pat_wc_enabled || boot_cpu_data.x86 > 3)
 		/*
 		 * ioremap() and ioremap_nocache() defaults to UC MINUS for now.
 		 * To avoid attribute conflicts, request UC MINUS here
 		 * aswell.
 		 */
 		prot |= _PAGE_CACHE_UC_MINUS;
-	else if (boot_cpu_data.x86 > 3)
-		prot |= _PAGE_CACHE_UC;
 
 	vma->vm_page_prot = __pgprot(prot);
 
Index: linux-x86.q/include/asm-x86/pgtable.h
===================================================================
--- linux-x86.q.orig/include/asm-x86/pgtable.h
+++ linux-x86.q/include/asm-x86/pgtable.h
@@ -66,6 +66,8 @@
 #define _PAGE_CACHE_UC_MINUS	(_PAGE_PCD)
 #define _PAGE_CACHE_UC		(_PAGE_PCD | _PAGE_PWT)
 
+#define _PAGE_PROT_PRESERVE_BITS	(_PAGE_CACHE_MASK)
+
 #define PAGE_NONE	__pgprot(_PAGE_PROTNONE | _PAGE_ACCESSED)
 #define PAGE_SHARED	__pgprot(_PAGE_PRESENT | _PAGE_RW | _PAGE_USER | \
 				 _PAGE_ACCESSED | _PAGE_NX)
@@ -289,7 +291,8 @@ static inline pte_t pte_modify(pte_t pte
 	 * Chop off the NX bit (if present), and add the NX portion of
 	 * the newprot (if present):
 	 */
-	val &= _PAGE_CHG_MASK & ~_PAGE_NX;
+	/* We also preserve PAT bits from existing pte */
+	val &= (_PAGE_CHG_MASK | _PAGE_PROT_PRESERVE_BITS)  & ~_PAGE_NX;
 	val |= pgprot_val(newprot) & __supported_pte_mask;
 
 	return __pte(val);
Index: linux-x86.q/include/linux/mm.h
===================================================================
--- linux-x86.q.orig/include/linux/mm.h
+++ linux-x86.q/include/linux/mm.h
@@ -1177,6 +1177,7 @@ static inline unsigned long vma_pages(st
 }
 
 pgprot_t vm_get_page_prot(unsigned long vm_flags);
+pgprot_t vm_get_page_prot_preserve(unsigned long vm_flags, pgprot_t oldprot);
 struct vm_area_struct *find_extend_vma(struct mm_struct *, unsigned long addr);
 int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
 			unsigned long pfn, unsigned long size, pgprot_t);
Index: linux-x86.q/mm/mmap.c
===================================================================
--- linux-x86.q.orig/mm/mmap.c
+++ linux-x86.q/mm/mmap.c
@@ -77,6 +77,19 @@ pgprot_t vm_get_page_prot(unsigned long 
 }
 EXPORT_SYMBOL(vm_get_page_prot);
 
+#ifndef _PAGE_PROT_PRESERVE_BITS
+#define _PAGE_PROT_PRESERVE_BITS	0
+#endif
+
+pgprot_t vm_get_page_prot_preserve(unsigned long vm_flags, pgprot_t oldprot)
+{
+	pteval_t newprotval = pgprot_val(oldprot);
+
+	newprotval &= _PAGE_PROT_PRESERVE_BITS;
+	newprotval |= pgprot_val(vm_get_page_prot(vm_flags));
+	return __pgprot(newprotval);
+}
+
 int sysctl_overcommit_memory = OVERCOMMIT_GUESS;  /* heuristic overcommit */
 int sysctl_overcommit_ratio = 50;	/* default is 50% */
 int sysctl_max_map_count __read_mostly = DEFAULT_MAX_MAP_COUNT;
Index: linux-x86.q/mm/mprotect.c
===================================================================
--- linux-x86.q.orig/mm/mprotect.c
+++ linux-x86.q/mm/mprotect.c
@@ -192,7 +192,9 @@ success:
 	 * held in write mode.
 	 */
 	vma->vm_flags = newflags;
-	vma->vm_page_prot = vm_get_page_prot(newflags);
+	vma->vm_page_prot = vm_get_page_prot_preserve(newflags,
+							vma->vm_page_prot);
+
 	if (vma_wants_writenotify(vma)) {
 		vma->vm_page_prot = vm_get_page_prot(newflags & ~VM_SHARED);
 		dirty_accountable = 1;

  reply	other threads:[~2008-05-07  7:02 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-05-02 19:22 [git head] Should X86_PAT really default to yes? Frans Pop
2008-05-02 19:37 ` Pallipadi, Venkatesh
2008-05-02 20:40   ` Jesse Barnes
2008-05-02 21:55     ` Pallipadi, Venkatesh
2008-05-02 22:07       ` Jesse Barnes
2008-05-04  7:10     ` Frans Pop
2008-05-04  9:04       ` Ingo Molnar
2008-05-04 20:23       ` Yinghai Lu
2008-05-05 16:55         ` Frans Pop
2008-05-05 17:00           ` Pallipadi, Venkatesh
2008-05-05 17:42             ` Yinghai Lu
2008-05-05 18:56             ` Frans Pop
2008-05-05 15:57       ` Jesse Barnes
2008-05-05 17:32         ` Frans Pop
2008-05-05 17:45           ` Jesse Barnes
2008-05-05 17:59             ` Pallipadi, Venkatesh
2008-05-05 18:59             ` Frans Pop
2008-05-05 19:04               ` fb layer & ioremap_wc Jesse Barnes
2008-05-05 19:30                 ` Frans Pop
2008-06-13 16:42                 ` Frans Pop
2008-06-13 16:42                   ` Frans Pop
2008-05-06 22:42           ` [git head] Should X86_PAT really default to yes? Venki Pallipadi
2008-05-07  7:02             ` Ingo Molnar [this message]
2008-05-07 19:18               ` [git head] X86_PAT & mprotect Hugh Dickins
2008-05-07 23:23                 ` Venki Pallipadi
2008-05-09 10:08                   ` Ingo Molnar
2008-05-09 20:05                     ` Venki Pallipadi
2008-05-09 20:09                       ` Venki Pallipadi
2008-05-09 20:48                         ` Hugh Dickins
2008-05-09 22:11                       ` Dave Airlie
2008-05-09 22:20                         ` Pallipadi, Venkatesh
2008-05-10  6:19                           ` Dave Airlie
2008-05-10  6:29                             ` Keith Packard
2008-05-10  5:45                         ` Keith Packard
2008-05-07 22:36               ` Venki Pallipadi
2008-05-25 15:08             ` [git head] Should X86_PAT really default to yes? Frans Pop

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080507070217.GD32195@elte.hu \
    --to=mingo@elte.hu \
    --cc=akpm@linux-foundation.org \
    --cc=elendil@planet.nl \
    --cc=hpa@zytor.com \
    --cc=hugh@veritas.com \
    --cc=jesse.barnes@intel.com \
    --cc=keith.packard@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=venkatesh.pallipadi@intel.com \
    --cc=yhlu.kernel@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.