From: Venki Pallipadi <venkatesh.pallipadi@intel.com>
To: Hugh Dickins <hugh@veritas.com>
Cc: Ingo Molnar <mingo@elte.hu>,
Venki Pallipadi <venkatesh.pallipadi@intel.com>,
Frans Pop <elendil@planet.nl>,
Jesse Barnes <jesse.barnes@intel.com>,
linux-kernel@vger.kernel.org, "Packard,
Keith" <keith.packard@intel.com>,
Yinghai Lu <yhlu.kernel@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
"H. Peter Anvin" <hpa@zytor.com>,
Thomas Gleixner <tglx@linutronix.de>,
Nick Piggin <npiggin@suse.de>
Subject: Re: [git head] X86_PAT & mprotect
Date: Wed, 7 May 2008 16:23:26 -0700 [thread overview]
Message-ID: <20080507232326.GB10757@linux-os.sc.intel.com> (raw)
In-Reply-To: <Pine.LNX.4.64.0805071941140.18916@blonde.site>
On Wed, May 07, 2008 at 08:18:45PM +0100, Hugh Dickins wrote:
> On Wed, 7 May 2008, Ingo Molnar wrote:
> > the pte_modify() change looks correct at first sight.
> >
> > The _PAGE_PROT_PRESERVE_BITS solution looks a bit ugly (although we do
> > have a couple of other similar #ifndefs in the MM already).
> >
> > at minimum we should add vm_get_page_prot_preserve() as an inline
> > function to mm.h if _PAGE_PROT_PRESERVE_BITS is not defined, and make it
> > call vm_get_page_prot(). Also, vm_get_page_prot() in mm/mmap.c should
> > probably be marked inline so that we'll have only a single function call
> > [vm_get_page_prot() is trivial].
>
> I've tried doing it slightly differently below,
> don't know whether you'll consider it an improvement or not.
Hugh: Thanks for looking into this. Yes. I like your modified patch. Simpler
and smaller.
> > ------------------------------------->
> > Subject: generic, x86, PAT: fix mprotect
> > From: Venki Pallipadi <venkatesh.pallipadi@intel.com>
> > Date: Tue, 6 May 2008 15:42:40 -0700
> >
> > There is a hole in mprotect, which lets the user to change the page
> > cache type bits by-passing the kernel reserve_memtype and free_memtype
> > wrappers. Fix the hole by not letting mprotect change the PAT bits.
>
> Maybe say "defect" rather than "hole"? I stupidly thought for a
> while that you were talking about some gap in the address space.
Yes. we can call it a loophole or defect.
> > @@ -289,7 +291,8 @@ static inline pte_t pte_modify(pte_t pte
> > * Chop off the NX bit (if present), and add the NX portion of
> > * the newprot (if present):
> > */
> > - val &= _PAGE_CHG_MASK & ~_PAGE_NX;
>
> (Nothing to do with your patch, but that's a silly line, isn't it?
> _PAGE_NX isn't in _PAGE_CHG_MASK so it doesn't need further masking out.)
Yes. That PAGE_NX part indeed looks bogus.
> > + /* We also preserve PAT bits from existing pte */
> > + val &= (_PAGE_CHG_MASK | _PAGE_PROT_PRESERVE_BITS) & ~_PAGE_NX;
> > val |= pgprot_val(newprot) & __supported_pte_mask;
> >
> > return __pte(val);
>
> An odd thing is that you end up keeping the PAT bits from the old
> pte, and the PAT bits from vm_page_prot. So if they can get out
> of synch, you'd be oring them together; and if they can't get out
> of synch, then do you need a change here?
They cannot get out of sync. So, we can do without the change here. But, I felt
it is safer to take things from old pte.
> I've followed that dup in my version below, but it's probably wrong.
> Yes, you want the PAT bits in vm_page_prot (so various places
> will automatically set them), but I think here it's more correct
> to take the PAT bits from the old pte and mask out those from the
> vm_page_prot?
Yes. We can mask out these bits from vm_page_prot, that way we will not
'or' them if they get out of sync.
If your below modified patch is OK, I can send an incremental patch to
change pte_modify() to inherit PAT bits only from oldpte.
Thanks,
Venki
> Here's my more minimal alternative; but I think we need to clarify
> on that apparent duplication of the PAT bits in pte_modify.
>
> arch/x86/pci/i386.c | 4 +---
> include/asm-x86/pgtable.h | 14 +++++++++++++-
> mm/mprotect.c | 11 ++++++++++-
> 3 files changed, 24 insertions(+), 5 deletions(-)
>
> --- 2.6.26-rc1/arch/x86/pci/i386.c 2008-05-03 21:54:41.000000000 +0100
> +++ linux/arch/x86/pci/i386.c 2008-05-07 17:14:53.000000000 +0100
> @@ -301,15 +301,13 @@ int pci_mmap_page_range(struct pci_dev *
> prot = pgprot_val(vma->vm_page_prot);
> if (pat_wc_enabled && write_combine)
> prot |= _PAGE_CACHE_WC;
> - else if (pat_wc_enabled)
> + else if (pat_wc_enabled || boot_cpu_data.x86 > 3)
> /*
> * ioremap() and ioremap_nocache() defaults to UC MINUS for now.
> * To avoid attribute conflicts, request UC MINUS here
> * aswell.
> */
> prot |= _PAGE_CACHE_UC_MINUS;
> - else if (boot_cpu_data.x86 > 3)
> - prot |= _PAGE_CACHE_UC;
>
> vma->vm_page_prot = __pgprot(prot);
>
> --- 2.6.26-rc1/include/asm-x86/pgtable.h 2008-05-03 21:55:10.000000000 +0100
> +++ linux/include/asm-x86/pgtable.h 2008-05-07 19:09:42.000000000 +0100
> @@ -57,7 +57,8 @@
> #define _KERNPG_TABLE (_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED | \
> _PAGE_DIRTY)
>
> -#define _PAGE_CHG_MASK (PTE_MASK | _PAGE_ACCESSED | _PAGE_DIRTY)
> +#define _PAGE_CHG_MASK (PTE_MASK | _PAGE_PWT | _PAGE_PCD | \
> + _PAGE_ACCESSED | _PAGE_DIRTY)
>
> #define _PAGE_CACHE_MASK (_PAGE_PCD | _PAGE_PWT)
> #define _PAGE_CACHE_WB (0)
> @@ -294,6 +295,17 @@ static inline pte_t pte_modify(pte_t pte
> return __pte(val);
> }
>
> +/*
> + * mprotect needs to preserve PAT bits when updating vm_page_prot
> + */
> +#define pgprot_modify pgprot_modify
> +static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot)
> +{
> + pgprotval_t preservebits = pgprot_val(oldprot) & _PAGE_CHG_MASK;
> + pgprotval_t addbits = pgprot_val(newprot);
> + return __pgprot(preservebits | addbits);
> +}
> +
> #define pte_pgprot(x) __pgprot(pte_val(x) & (0xfff | _PAGE_NX))
>
> #define canon_pgprot(p) __pgprot(pgprot_val(p) & __supported_pte_mask)
> --- 2.6.26-rc1/mm/mprotect.c 2008-01-24 22:58:37.000000000 +0000
> +++ linux/mm/mprotect.c 2008-05-07 18:42:07.000000000 +0100
> @@ -26,6 +26,13 @@
> #include <asm/cacheflush.h>
> #include <asm/tlbflush.h>
>
> +#ifndef pgprot_modify
> +static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot)
> +{
> + return newprot;
> +}
> +#endif
> +
> static void change_pte_range(struct mm_struct *mm, pmd_t *pmd,
> unsigned long addr, unsigned long end, pgprot_t newprot,
> int dirty_accountable)
> @@ -192,7 +199,9 @@ success:
> * held in write mode.
> */
> vma->vm_flags = newflags;
> - vma->vm_page_prot = vm_get_page_prot(newflags);
> + vma->vm_page_prot = pgprot_modify(vma->vm_page_prot,
> + vm_get_page_prot(newflags));
> +
> if (vma_wants_writenotify(vma)) {
> vma->vm_page_prot = vm_get_page_prot(newflags & ~VM_SHARED);
> dirty_accountable = 1;
next prev parent reply other threads:[~2008-05-07 23:23 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-05-02 19:22 [git head] Should X86_PAT really default to yes? Frans Pop
2008-05-02 19:37 ` Pallipadi, Venkatesh
2008-05-02 20:40 ` Jesse Barnes
2008-05-02 21:55 ` Pallipadi, Venkatesh
2008-05-02 22:07 ` Jesse Barnes
2008-05-04 7:10 ` Frans Pop
2008-05-04 9:04 ` Ingo Molnar
2008-05-04 20:23 ` Yinghai Lu
2008-05-05 16:55 ` Frans Pop
2008-05-05 17:00 ` Pallipadi, Venkatesh
2008-05-05 17:42 ` Yinghai Lu
2008-05-05 18:56 ` Frans Pop
2008-05-05 15:57 ` Jesse Barnes
2008-05-05 17:32 ` Frans Pop
2008-05-05 17:45 ` Jesse Barnes
2008-05-05 17:59 ` Pallipadi, Venkatesh
2008-05-05 18:59 ` Frans Pop
2008-05-05 19:04 ` fb layer & ioremap_wc Jesse Barnes
2008-06-13 16:42 ` Frans Pop
2008-05-06 22:42 ` [git head] Should X86_PAT really default to yes? Venki Pallipadi
2008-05-07 7:02 ` [git head] X86_PAT & mprotect Ingo Molnar
2008-05-07 19:18 ` Hugh Dickins
2008-05-07 23:23 ` Venki Pallipadi [this message]
2008-05-09 10:08 ` Ingo Molnar
2008-05-09 20:05 ` Venki Pallipadi
2008-05-09 20:09 ` Venki Pallipadi
2008-05-09 20:48 ` Hugh Dickins
2008-05-09 22:11 ` Dave Airlie
2008-05-09 22:20 ` Pallipadi, Venkatesh
2008-05-10 6:19 ` Dave Airlie
2008-05-10 6:29 ` Keith Packard
2008-05-10 5:45 ` Keith Packard
2008-05-07 22:36 ` Venki Pallipadi
2008-05-25 15:08 ` [git head] Should X86_PAT really default to yes? Frans Pop
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080507232326.GB10757@linux-os.sc.intel.com \
--to=venkatesh.pallipadi@intel.com \
--cc=akpm@linux-foundation.org \
--cc=elendil@planet.nl \
--cc=hpa@zytor.com \
--cc=hugh@veritas.com \
--cc=jesse.barnes@intel.com \
--cc=keith.packard@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=npiggin@suse.de \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=yhlu.kernel@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox