linux-mm.kvack.org archive mirror
* [PATCH v3 0/10] x86/mm: Handle large PAT bit in pud/pmd interfaces
@ 2015-08-05 21:43 Toshi Kani
  2015-08-05 21:43 ` [PATCH v3 1/10] x86/vdso32: Define PGTABLE_LEVELS to 32bit VDSO Toshi Kani
                   ` (9 more replies)
  0 siblings, 10 replies; 16+ messages in thread
From: Toshi Kani @ 2015-08-05 21:43 UTC (permalink / raw)
  To: hpa, tglx, mingo
  Cc: akpm, bp, linux-mm, linux-kernel, x86, jgross, konrad.wilk,
	elliott

The PAT bit gets relocated to bit 12 when PUD/PMD mappings are used.  This
bit 12, however, is not covered by PTE_FLAGS_MASK, which is currently used
for masking the pfn and flags at all levels.  This patch set updates the
pud/pmd interfaces and multiple functions to handle the large PAT bit
properly.

Patches 1/10-2/10 make the changes necessary for patch 3/10 to use
P?D_PAGE_MASK.

Patch 3/10 fixes pud/pmd interfaces to handle the PAT bit properly, and
patch 4/10 adds p?d_pgprot() interfaces for PUD/PMD.

Patch 5/10 fixes /sys/kernel/debug/kernel_page_tables to show the PAT bit
properly.

Patches 6/10-9/10 fix multiple functions to handle the large PAT bit properly.

Patch 10/10 fixes the same pgprot handling in try_preserve_large_page() by
leveraging the changes made in patch 8/10.

Note, the PAT bit is first enabled in 4.2-rc1 with WT mappings.  The functions
fixed by patches 6/10-9/10 are not used with WT mappings yet.  These fixes
will protect them from future use with the PAT bit set.

---
v3:
 - Add patches 4/10 and 6/10-9/10 so that multiple interfaces handle the
   large PAT bit.
 - Add patch 10/10 to fix the same pgprot handling in
   try_preserve_large_page().

v2:
 - Change p?d_pfn() to handle the PAT bit. (Juergen Gross)
 - Mask pfn and flags with P?D_PAGE_MASK. (Juergen Gross)
 - Change p?d_page_vaddr() and p?d_page() to handle the PAT bit.

---
Toshi Kani (10):
  1/10 x86/vdso32: Define PGTABLE_LEVELS to 32bit VDSO
  2/10 x86/asm: Move PUD_PAGE macros to page_types.h
  3/10 x86/asm: Fix pud/pmd interfaces to handle large PAT bit
  4/10 x86/asm: Add pud_pgprot() and pmd_pgprot()
  5/10 x86/mm: Fix page table dump to show PAT bit
  6/10 x86/mm: Fix slow_virt_to_phys() to handle large PAT bit
  7/10 x86/mm: Fix gup_huge_p?d() to handle large PAT bit
  8/10 x86/mm: Fix try_preserve_large_page() to handle large PAT bit
  9/10 x86/mm: Fix __split_large_page() to handle large PAT bit
 10/10 x86/mm: Fix the same pgprot handling in try_preserve_large_page()

---
 arch/x86/entry/vdso/vdso32/vclock_gettime.c |  2 +
 arch/x86/include/asm/page_64_types.h        |  3 --
 arch/x86/include/asm/page_types.h           |  3 ++
 arch/x86/include/asm/pgtable.h              | 18 ++++---
 arch/x86/include/asm/pgtable_types.h        | 40 +++++++++++++--
 arch/x86/mm/dump_pagetables.c               | 39 +++++++-------
 arch/x86/mm/gup.c                           | 18 +++----
 arch/x86/mm/pageattr.c                      | 79 ++++++++++++++++++-----------
 8 files changed, 131 insertions(+), 71 deletions(-)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <dont@kvack.org>


* [PATCH v3 1/10] x86/vdso32: Define PGTABLE_LEVELS to 32bit VDSO
  2015-08-05 21:43 [PATCH v3 0/10] x86/mm: Handle large PAT bit in pud/pmd interfaces Toshi Kani
@ 2015-08-05 21:43 ` Toshi Kani
  2015-08-20 19:46   ` Thomas Gleixner
  2015-08-05 21:43 ` [PATCH v3 2/10] x86/asm: Move PUD_PAGE macros to page_types.h Toshi Kani
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 16+ messages in thread
From: Toshi Kani @ 2015-08-05 21:43 UTC (permalink / raw)
  To: hpa, tglx, mingo
  Cc: akpm, bp, linux-mm, linux-kernel, x86, jgross, konrad.wilk,
	elliott, Toshi Kani

In the CONFIG_X86_64 case, vdso32/vclock_gettime.c fakes a 32bit
kernel configuration by undefining CONFIG_X86_64 and defining
CONFIG_X86_32.  However, it does not redefine CONFIG_PGTABLE_LEVELS,
leaving it at 4 levels.  Fix it by redefining CONFIG_PGTABLE_LEVELS
to 2, since X86_PAE is not set.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
---
 arch/x86/entry/vdso/vdso32/vclock_gettime.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/entry/vdso/vdso32/vclock_gettime.c b/arch/x86/entry/vdso/vdso32/vclock_gettime.c
index 175cc72..87a86e0 100644
--- a/arch/x86/entry/vdso/vdso32/vclock_gettime.c
+++ b/arch/x86/entry/vdso/vdso32/vclock_gettime.c
@@ -14,11 +14,13 @@
  */
 #undef CONFIG_64BIT
 #undef CONFIG_X86_64
+#undef CONFIG_PGTABLE_LEVELS
 #undef CONFIG_ILLEGAL_POINTER_VALUE
 #undef CONFIG_SPARSEMEM_VMEMMAP
 #undef CONFIG_NR_CPUS
 
 #define CONFIG_X86_32 1
+#define CONFIG_PGTABLE_LEVELS 2
 #define CONFIG_PAGE_OFFSET 0
 #define CONFIG_ILLEGAL_POINTER_VALUE 0
 #define CONFIG_NR_CPUS 1



* [PATCH v3 2/10] x86/asm: Move PUD_PAGE macros to page_types.h
  2015-08-05 21:43 [PATCH v3 0/10] x86/mm: Handle large PAT bit in pud/pmd interfaces Toshi Kani
  2015-08-05 21:43 ` [PATCH v3 1/10] x86/vdso32: Define PGTABLE_LEVELS to 32bit VDSO Toshi Kani
@ 2015-08-05 21:43 ` Toshi Kani
  2015-08-05 21:43 ` [PATCH v3 3/10] x86/asm: Fix pud/pmd interfaces to handle large PAT bit Toshi Kani
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Toshi Kani @ 2015-08-05 21:43 UTC (permalink / raw)
  To: hpa, tglx, mingo
  Cc: akpm, bp, linux-mm, linux-kernel, x86, jgross, konrad.wilk,
	elliott, Toshi Kani

PUD_SHIFT is defined according to the kernel configuration, which
allows it to be used commonly by any kernel.  However, PUD_PAGE_SIZE
and PUD_PAGE_MASK, which are calculated from PUD_SHIFT, are defined
in page_64_types.h and can therefore be used by 64-bit kernels only.

Move PUD_PAGE_SIZE and PUD_PAGE_MASK to page_types.h so that they
can be used by any kernel as well.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
---
 arch/x86/include/asm/page_64_types.h |    3 ---
 arch/x86/include/asm/page_types.h    |    3 +++
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index 4edd53b..4928cf0 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -26,9 +26,6 @@
 #define MCE_STACK 4
 #define N_EXCEPTION_STACKS 4  /* hw limit: 7 */
 
-#define PUD_PAGE_SIZE		(_AC(1, UL) << PUD_SHIFT)
-#define PUD_PAGE_MASK		(~(PUD_PAGE_SIZE-1))
-
 /*
  * Set __PAGE_OFFSET to the most negative possible address +
  * PGDIR_SIZE*16 (pgd slot 272).  The gap is to allow a space for a
diff --git a/arch/x86/include/asm/page_types.h b/arch/x86/include/asm/page_types.h
index c7c712f..c5b7fb2 100644
--- a/arch/x86/include/asm/page_types.h
+++ b/arch/x86/include/asm/page_types.h
@@ -20,6 +20,9 @@
 #define PMD_PAGE_SIZE		(_AC(1, UL) << PMD_SHIFT)
 #define PMD_PAGE_MASK		(~(PMD_PAGE_SIZE-1))
 
+#define PUD_PAGE_SIZE		(_AC(1, UL) << PUD_SHIFT)
+#define PUD_PAGE_MASK		(~(PUD_PAGE_SIZE-1))
+
 #define HPAGE_SHIFT		PMD_SHIFT
 #define HPAGE_SIZE		(_AC(1,UL) << HPAGE_SHIFT)
 #define HPAGE_MASK		(~(HPAGE_SIZE - 1))



* [PATCH v3 3/10] x86/asm: Fix pud/pmd interfaces to handle large PAT bit
  2015-08-05 21:43 [PATCH v3 0/10] x86/mm: Handle large PAT bit in pud/pmd interfaces Toshi Kani
  2015-08-05 21:43 ` [PATCH v3 1/10] x86/vdso32: Define PGTABLE_LEVELS to 32bit VDSO Toshi Kani
  2015-08-05 21:43 ` [PATCH v3 2/10] x86/asm: Move PUD_PAGE macros to page_types.h Toshi Kani
@ 2015-08-05 21:43 ` Toshi Kani
  2015-08-25  8:16   ` Thomas Gleixner
  2015-08-05 21:43 ` [PATCH v3 4/10] x86/asm: Add pud_pgprot() and pmd_pgprot() Toshi Kani
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 16+ messages in thread
From: Toshi Kani @ 2015-08-05 21:43 UTC (permalink / raw)
  To: hpa, tglx, mingo
  Cc: akpm, bp, linux-mm, linux-kernel, x86, jgross, konrad.wilk,
	elliott, Toshi Kani

The PAT bit gets relocated to bit 12 when PUD and PMD mappings are
used.  This bit 12, however, is not covered by PTE_FLAGS_MASK, which
is currently used for masking the pfn and flags in all cases.

Fix the pud/pmd interfaces to handle the pfn and flags properly by
using P?D_PAGE_MASK when PUD/PMD mappings are used, i.e. when the PSE
bit is set.

Suggested-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Konrad Wilk <konrad.wilk@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
---
 arch/x86/include/asm/pgtable.h       |   14 +++++++-----
 arch/x86/include/asm/pgtable_types.h |   40 +++++++++++++++++++++++++++++++---
 2 files changed, 44 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 867da5b..0733ec7 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -142,12 +142,12 @@ static inline unsigned long pte_pfn(pte_t pte)
 
 static inline unsigned long pmd_pfn(pmd_t pmd)
 {
-	return (pmd_val(pmd) & PTE_PFN_MASK) >> PAGE_SHIFT;
+	return (pmd_val(pmd) & pmd_pfn_mask(pmd)) >> PAGE_SHIFT;
 }
 
 static inline unsigned long pud_pfn(pud_t pud)
 {
-	return (pud_val(pud) & PTE_PFN_MASK) >> PAGE_SHIFT;
+	return (pud_val(pud) & pud_pfn_mask(pud)) >> PAGE_SHIFT;
 }
 
 #define pte_page(pte)	pfn_to_page(pte_pfn(pte))
@@ -502,14 +502,15 @@ static inline int pmd_none(pmd_t pmd)
 
 static inline unsigned long pmd_page_vaddr(pmd_t pmd)
 {
-	return (unsigned long)__va(pmd_val(pmd) & PTE_PFN_MASK);
+	return (unsigned long)__va(pmd_val(pmd) & pmd_pfn_mask(pmd));
 }
 
 /*
  * Currently stuck as a macro due to indirect forward reference to
  * linux/mmzone.h's __section_mem_map_addr() definition:
  */
-#define pmd_page(pmd)	pfn_to_page((pmd_val(pmd) & PTE_PFN_MASK) >> PAGE_SHIFT)
+#define pmd_page(pmd)		\
+	pfn_to_page((pmd_val(pmd) & pmd_pfn_mask(pmd)) >> PAGE_SHIFT)
 
 /*
  * the pmd page can be thought of an array like this: pmd_t[PTRS_PER_PMD]
@@ -570,14 +571,15 @@ static inline int pud_present(pud_t pud)
 
 static inline unsigned long pud_page_vaddr(pud_t pud)
 {
-	return (unsigned long)__va((unsigned long)pud_val(pud) & PTE_PFN_MASK);
+	return (unsigned long)__va(pud_val(pud) & pud_pfn_mask(pud));
 }
 
 /*
  * Currently stuck as a macro due to indirect forward reference to
  * linux/mmzone.h's __section_mem_map_addr() definition:
  */
-#define pud_page(pud)		pfn_to_page(pud_val(pud) >> PAGE_SHIFT)
+#define pud_page(pud)		\
+	pfn_to_page((pud_val(pud) & pud_pfn_mask(pud)) >> PAGE_SHIFT)
 
 /* Find an entry in the second-level page table.. */
 static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 13f310b..dd5b0aa 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -209,10 +209,10 @@ enum page_cache_mode {
 
 #include <linux/types.h>
 
-/* PTE_PFN_MASK extracts the PFN from a (pte|pmd|pud|pgd)val_t */
+/* Extracts the PFN from a (pte|pmd|pud|pgd)val_t of a 4KB page */
 #define PTE_PFN_MASK		((pteval_t)PHYSICAL_PAGE_MASK)
 
-/* PTE_FLAGS_MASK extracts the flags from a (pte|pmd|pud|pgd)val_t */
+/* Extracts the flags from a (pte|pmd|pud|pgd)val_t of a 4KB page */
 #define PTE_FLAGS_MASK		(~PTE_PFN_MASK)
 
 typedef struct pgprot { pgprotval_t pgprot; } pgprot_t;
@@ -276,14 +276,46 @@ static inline pmdval_t native_pmd_val(pmd_t pmd)
 }
 #endif
 
+static inline pudval_t pud_pfn_mask(pud_t pud)
+{
+	if (native_pud_val(pud) & _PAGE_PSE)
+		return PUD_PAGE_MASK & PHYSICAL_PAGE_MASK;
+	else
+		return PTE_PFN_MASK;
+}
+
+static inline pudval_t pud_flags_mask(pud_t pud)
+{
+	if (native_pud_val(pud) & _PAGE_PSE)
+		return ~(PUD_PAGE_MASK & (pudval_t)PHYSICAL_PAGE_MASK);
+	else
+		return ~PTE_PFN_MASK;
+}
+
 static inline pudval_t pud_flags(pud_t pud)
 {
-	return native_pud_val(pud) & PTE_FLAGS_MASK;
+	return native_pud_val(pud) & pud_flags_mask(pud);
+}
+
+static inline pmdval_t pmd_pfn_mask(pmd_t pmd)
+{
+	if (native_pmd_val(pmd) & _PAGE_PSE)
+		return PMD_PAGE_MASK & PHYSICAL_PAGE_MASK;
+	else
+		return PTE_PFN_MASK;
+}
+
+static inline pmdval_t pmd_flags_mask(pmd_t pmd)
+{
+	if (native_pmd_val(pmd) & _PAGE_PSE)
+		return ~(PMD_PAGE_MASK & (pmdval_t)PHYSICAL_PAGE_MASK);
+	else
+		return ~PTE_PFN_MASK;
 }
 
 static inline pmdval_t pmd_flags(pmd_t pmd)
 {
-	return native_pmd_val(pmd) & PTE_FLAGS_MASK;
+	return native_pmd_val(pmd) & pmd_flags_mask(pmd);
 }
 
 static inline pte_t native_make_pte(pteval_t val)



* [PATCH v3 4/10] x86/asm: Add pud_pgprot() and pmd_pgprot()
  2015-08-05 21:43 [PATCH v3 0/10] x86/mm: Handle large PAT bit in pud/pmd interfaces Toshi Kani
                   ` (2 preceding siblings ...)
  2015-08-05 21:43 ` [PATCH v3 3/10] x86/asm: Fix pud/pmd interfaces to handle large PAT bit Toshi Kani
@ 2015-08-05 21:43 ` Toshi Kani
  2015-08-05 21:43 ` [PATCH v3 5/10] x86/mm: Fix page table dump to show PAT bit Toshi Kani
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Toshi Kani @ 2015-08-05 21:43 UTC (permalink / raw)
  To: hpa, tglx, mingo
  Cc: akpm, bp, linux-mm, linux-kernel, x86, jgross, konrad.wilk,
	elliott, Toshi Kani

pte_pgprot() returns a pgprot_t value by calling pte_flags().  Now
that pud_flags() and pmd_flags() work differently from pte_flags(),
define pud_pgprot() and pmd_pgprot() for PUD/PMD.  Also update
pte_pgprot() to remove the masking with PTE_FLAGS_MASK, which is
unnecessary since pte_flags() takes care of it.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
---
 arch/x86/include/asm/pgtable.h |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 0733ec7..59fc341 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -379,7 +379,9 @@ static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot)
 	return __pgprot(preservebits | addbits);
 }
 
-#define pte_pgprot(x) __pgprot(pte_flags(x) & PTE_FLAGS_MASK)
+#define pte_pgprot(x) __pgprot(pte_flags(x))
+#define pmd_pgprot(x) __pgprot(pmd_flags(x))
+#define pud_pgprot(x) __pgprot(pud_flags(x))
 
 #define canon_pgprot(p) __pgprot(massage_pgprot(p))
 



* [PATCH v3 5/10] x86/mm: Fix page table dump to show PAT bit
  2015-08-05 21:43 [PATCH v3 0/10] x86/mm: Handle large PAT bit in pud/pmd interfaces Toshi Kani
                   ` (3 preceding siblings ...)
  2015-08-05 21:43 ` [PATCH v3 4/10] x86/asm: Add pud_pgprot() and pmd_pgprot() Toshi Kani
@ 2015-08-05 21:43 ` Toshi Kani
  2015-08-05 21:43 ` [PATCH v3 6/10] x86/mm: Fix slow_virt_to_phys() to handle large " Toshi Kani
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Toshi Kani @ 2015-08-05 21:43 UTC (permalink / raw)
  To: hpa, tglx, mingo
  Cc: akpm, bp, linux-mm, linux-kernel, x86, jgross, konrad.wilk,
	elliott, Toshi Kani

/sys/kernel/debug/kernel_page_tables does not show the PAT bit for
PUD/PMD mappings.  This is because walk_pud_level(), walk_pmd_level()
and note_page() mask the flags with PTE_FLAGS_MASK, which does not
cover their PAT bit, _PAGE_PAT_LARGE.

Fix it by replacing the use of PTE_FLAGS_MASK with p?d_flags(),
which masks the flags properly.

Also change the PAT bit to be shown as "PAT", consistent with the
other bits.

Reported-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Robert Elliott <elliott@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 arch/x86/mm/dump_pagetables.c |   39 +++++++++++++++++++++------------------
 1 file changed, 21 insertions(+), 18 deletions(-)

diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index f0cedf3..71ab2d7 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -155,7 +155,7 @@ static void printk_prot(struct seq_file *m, pgprot_t prot, int level, bool dmsg)
 			pt_dump_cont_printf(m, dmsg, "    ");
 		if ((level == 4 && pr & _PAGE_PAT) ||
 		    ((level == 3 || level == 2) && pr & _PAGE_PAT_LARGE))
-			pt_dump_cont_printf(m, dmsg, "pat ");
+			pt_dump_cont_printf(m, dmsg, "PAT ");
 		else
 			pt_dump_cont_printf(m, dmsg, "    ");
 		if (pr & _PAGE_GLOBAL)
@@ -198,8 +198,8 @@ static void note_page(struct seq_file *m, struct pg_state *st,
 	 * we have now. "break" is either changing perms, levels or
 	 * address space marker.
 	 */
-	prot = pgprot_val(new_prot) & PTE_FLAGS_MASK;
-	cur = pgprot_val(st->current_prot) & PTE_FLAGS_MASK;
+	prot = pgprot_val(new_prot);
+	cur = pgprot_val(st->current_prot);
 
 	if (!st->level) {
 		/* First entry */
@@ -269,13 +269,13 @@ static void walk_pte_level(struct seq_file *m, struct pg_state *st, pmd_t addr,
 {
 	int i;
 	pte_t *start;
+	pgprotval_t prot;
 
 	start = (pte_t *) pmd_page_vaddr(addr);
 	for (i = 0; i < PTRS_PER_PTE; i++) {
-		pgprot_t prot = pte_pgprot(*start);
-
+		prot = pte_flags(*start);
 		st->current_address = normalize_addr(P + i * PTE_LEVEL_MULT);
-		note_page(m, st, prot, 4);
+		note_page(m, st, __pgprot(prot), 4);
 		start++;
 	}
 }
@@ -287,18 +287,19 @@ static void walk_pmd_level(struct seq_file *m, struct pg_state *st, pud_t addr,
 {
 	int i;
 	pmd_t *start;
+	pgprotval_t prot;
 
 	start = (pmd_t *) pud_page_vaddr(addr);
 	for (i = 0; i < PTRS_PER_PMD; i++) {
 		st->current_address = normalize_addr(P + i * PMD_LEVEL_MULT);
 		if (!pmd_none(*start)) {
-			pgprotval_t prot = pmd_val(*start) & PTE_FLAGS_MASK;
-
-			if (pmd_large(*start) || !pmd_present(*start))
+			if (pmd_large(*start) || !pmd_present(*start)) {
+				prot = pmd_flags(*start);
 				note_page(m, st, __pgprot(prot), 3);
-			else
+			} else {
 				walk_pte_level(m, st, *start,
 					       P + i * PMD_LEVEL_MULT);
+			}
 		} else
 			note_page(m, st, __pgprot(0), 3);
 		start++;
@@ -318,19 +319,20 @@ static void walk_pud_level(struct seq_file *m, struct pg_state *st, pgd_t addr,
 {
 	int i;
 	pud_t *start;
+	pgprotval_t prot;
 
 	start = (pud_t *) pgd_page_vaddr(addr);
 
 	for (i = 0; i < PTRS_PER_PUD; i++) {
 		st->current_address = normalize_addr(P + i * PUD_LEVEL_MULT);
 		if (!pud_none(*start)) {
-			pgprotval_t prot = pud_val(*start) & PTE_FLAGS_MASK;
-
-			if (pud_large(*start) || !pud_present(*start))
+			if (pud_large(*start) || !pud_present(*start)) {
+				prot = pud_flags(*start);
 				note_page(m, st, __pgprot(prot), 2);
-			else
+			} else {
 				walk_pmd_level(m, st, *start,
 					       P + i * PUD_LEVEL_MULT);
+			}
 		} else
 			note_page(m, st, __pgprot(0), 2);
 
@@ -351,6 +353,7 @@ void ptdump_walk_pgd_level(struct seq_file *m, pgd_t *pgd)
 #else
 	pgd_t *start = swapper_pg_dir;
 #endif
+	pgprotval_t prot;
 	int i;
 	struct pg_state st = {};
 
@@ -362,13 +365,13 @@ void ptdump_walk_pgd_level(struct seq_file *m, pgd_t *pgd)
 	for (i = 0; i < PTRS_PER_PGD; i++) {
 		st.current_address = normalize_addr(i * PGD_LEVEL_MULT);
 		if (!pgd_none(*start)) {
-			pgprotval_t prot = pgd_val(*start) & PTE_FLAGS_MASK;
-
-			if (pgd_large(*start) || !pgd_present(*start))
+			if (pgd_large(*start) || !pgd_present(*start)) {
+				prot = pgd_flags(*start);
 				note_page(m, &st, __pgprot(prot), 1);
-			else
+			} else {
 				walk_pud_level(m, &st, *start,
 					       i * PGD_LEVEL_MULT);
+			}
 		} else
 			note_page(m, &st, __pgprot(0), 1);
 



* [PATCH v3 6/10] x86/mm: Fix slow_virt_to_phys() to handle large PAT bit
  2015-08-05 21:43 [PATCH v3 0/10] x86/mm: Handle large PAT bit in pud/pmd interfaces Toshi Kani
                   ` (4 preceding siblings ...)
  2015-08-05 21:43 ` [PATCH v3 5/10] x86/mm: Fix page table dump to show PAT bit Toshi Kani
@ 2015-08-05 21:43 ` Toshi Kani
  2015-08-05 21:43 ` [PATCH v3 7/10] x86/mm: Fix gup_huge_p?d() " Toshi Kani
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Toshi Kani @ 2015-08-05 21:43 UTC (permalink / raw)
  To: hpa, tglx, mingo
  Cc: akpm, bp, linux-mm, linux-kernel, x86, jgross, konrad.wilk,
	elliott, Toshi Kani

slow_virt_to_phys() calls lookup_address() to obtain *pte and
its level.  It then calls pte_pfn() to obtain the PFN regardless
of the level.  This does not yield the correct PFN when the large
PAT bit is set, because pte_pfn() does not mask the large PAT bit
properly for PUD/PMD entries.

Fix slow_virt_to_phys() to use pud_pfn() and pmd_pfn() according
to the level.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 arch/x86/mm/pageattr.c |   24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 727158c..ecc24e5 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -415,18 +415,28 @@ pmd_t *lookup_pmd_address(unsigned long address)
 phys_addr_t slow_virt_to_phys(void *__virt_addr)
 {
 	unsigned long virt_addr = (unsigned long)__virt_addr;
-	phys_addr_t phys_addr;
-	unsigned long offset;
+	unsigned long phys_addr, offset;
 	enum pg_level level;
-	unsigned long pmask;
 	pte_t *pte;
 
 	pte = lookup_address(virt_addr, &level);
 	BUG_ON(!pte);
-	pmask = page_level_mask(level);
-	offset = virt_addr & ~pmask;
-	phys_addr = (phys_addr_t)pte_pfn(*pte) << PAGE_SHIFT;
-	return (phys_addr | offset);
+
+	switch (level) {
+	case PG_LEVEL_1G:
+		phys_addr = pud_pfn(*(pud_t *)pte) << PAGE_SHIFT;
+		offset = virt_addr & ~PUD_PAGE_MASK;
+		break;
+	case PG_LEVEL_2M:
+		phys_addr = pmd_pfn(*(pmd_t *)pte) << PAGE_SHIFT;
+		offset = virt_addr & ~PMD_PAGE_MASK;
+		break;
+	default:
+		phys_addr = pte_pfn(*pte) << PAGE_SHIFT;
+		offset = virt_addr & ~PAGE_MASK;
+	}
+
+	return (phys_addr_t)(phys_addr | offset);
 }
 EXPORT_SYMBOL_GPL(slow_virt_to_phys);
 



* [PATCH v3 7/10] x86/mm: Fix gup_huge_p?d() to handle large PAT bit
  2015-08-05 21:43 [PATCH v3 0/10] x86/mm: Handle large PAT bit in pud/pmd interfaces Toshi Kani
                   ` (5 preceding siblings ...)
  2015-08-05 21:43 ` [PATCH v3 6/10] x86/mm: Fix slow_virt_to_phys() to handle large " Toshi Kani
@ 2015-08-05 21:43 ` Toshi Kani
  2015-08-05 21:43 ` [PATCH v3 8/10] x86/mm: Fix try_preserve_large_page() " Toshi Kani
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Toshi Kani @ 2015-08-05 21:43 UTC (permalink / raw)
  To: hpa, tglx, mingo
  Cc: akpm, bp, linux-mm, linux-kernel, x86, jgross, konrad.wilk,
	elliott, Toshi Kani

gup_huge_pud() and gup_huge_pmd() cast *pud and *pmd to *pte,
and use the pte_xxx() interfaces to obtain the flags and PFN.
However, the pte_xxx() interfaces do not handle the large
PAT bit properly for PUD/PMD entries.

Fix gup_huge_pud() and gup_huge_pmd() to use the pud_xxx() and
pmd_xxx() interfaces according to the entry type.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 arch/x86/mm/gup.c |   18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/arch/x86/mm/gup.c b/arch/x86/mm/gup.c
index 81bf3d2..ae9a37b 100644
--- a/arch/x86/mm/gup.c
+++ b/arch/x86/mm/gup.c
@@ -118,21 +118,20 @@ static noinline int gup_huge_pmd(pmd_t pmd, unsigned long addr,
 		unsigned long end, int write, struct page **pages, int *nr)
 {
 	unsigned long mask;
-	pte_t pte = *(pte_t *)&pmd;
 	struct page *head, *page;
 	int refs;
 
 	mask = _PAGE_PRESENT|_PAGE_USER;
 	if (write)
 		mask |= _PAGE_RW;
-	if ((pte_flags(pte) & mask) != mask)
+	if ((pmd_flags(pmd) & mask) != mask)
 		return 0;
 	/* hugepages are never "special" */
-	VM_BUG_ON(pte_flags(pte) & _PAGE_SPECIAL);
-	VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
+	VM_BUG_ON(pmd_flags(pmd) & _PAGE_SPECIAL);
+	VM_BUG_ON(!pfn_valid(pmd_pfn(pmd)));
 
 	refs = 0;
-	head = pte_page(pte);
+	head = pmd_page(pmd);
 	page = head + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
 	do {
 		VM_BUG_ON_PAGE(compound_head(page) != head, page);
@@ -195,21 +194,20 @@ static noinline int gup_huge_pud(pud_t pud, unsigned long addr,
 		unsigned long end, int write, struct page **pages, int *nr)
 {
 	unsigned long mask;
-	pte_t pte = *(pte_t *)&pud;
 	struct page *head, *page;
 	int refs;
 
 	mask = _PAGE_PRESENT|_PAGE_USER;
 	if (write)
 		mask |= _PAGE_RW;
-	if ((pte_flags(pte) & mask) != mask)
+	if ((pud_flags(pud) & mask) != mask)
 		return 0;
 	/* hugepages are never "special" */
-	VM_BUG_ON(pte_flags(pte) & _PAGE_SPECIAL);
-	VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
+	VM_BUG_ON(pud_flags(pud) & _PAGE_SPECIAL);
+	VM_BUG_ON(!pfn_valid(pud_pfn(pud)));
 
 	refs = 0;
-	head = pte_page(pte);
+	head = pud_page(pud);
 	page = head + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
 	do {
 		VM_BUG_ON_PAGE(compound_head(page) != head, page);



* [PATCH v3 8/10] x86/mm: Fix try_preserve_large_page() to handle large PAT bit
  2015-08-05 21:43 [PATCH v3 0/10] x86/mm: Handle large PAT bit in pud/pmd interfaces Toshi Kani
                   ` (6 preceding siblings ...)
  2015-08-05 21:43 ` [PATCH v3 7/10] x86/mm: Fix gup_huge_p?d() " Toshi Kani
@ 2015-08-05 21:43 ` Toshi Kani
  2015-08-05 21:43 ` [PATCH v3 9/10] x86/mm: Fix __split_large_page() " Toshi Kani
  2015-08-05 21:43 ` [PATCH v3 10/10] x86/mm: Fix the same pgprot handling in try_preserve_large_page() Toshi Kani
  9 siblings, 0 replies; 16+ messages in thread
From: Toshi Kani @ 2015-08-05 21:43 UTC (permalink / raw)
  To: hpa, tglx, mingo
  Cc: akpm, bp, linux-mm, linux-kernel, x86, jgross, konrad.wilk,
	elliott, Toshi Kani

try_preserve_large_page() is called from __change_page_attr() to
change the mapping attributes while preserving the large page.  This
function uses pte_pfn() and pte_pgprot() on PUD/PMD entries, which do
not handle the large PAT bit properly.

Fix try_preserve_large_page() to use corresponding p?d_pfn() and
p?d_pgprot().

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 arch/x86/mm/pageattr.c |   24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index ecc24e5..2724755 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -469,7 +469,7 @@ static int
 try_preserve_large_page(pte_t *kpte, unsigned long address,
 			struct cpa_data *cpa)
 {
-	unsigned long nextpage_addr, numpages, pmask, psize, addr, pfn;
+	unsigned long nextpage_addr, numpages, pmask, psize, addr, pfn, old_pfn;
 	pte_t new_pte, old_pte, *tmp;
 	pgprot_t old_prot, new_prot, req_prot;
 	int i, do_split = 1;
@@ -489,17 +489,21 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 
 	switch (level) {
 	case PG_LEVEL_2M:
-#ifdef CONFIG_X86_64
+		old_prot = pmd_pgprot(*(pmd_t *)kpte);
+		old_pfn = pmd_pfn(*(pmd_t *)kpte);
+		break;
 	case PG_LEVEL_1G:
-#endif
-		psize = page_level_size(level);
-		pmask = page_level_mask(level);
+		old_prot = pud_pgprot(*(pud_t *)kpte);
+		old_pfn = pud_pfn(*(pud_t *)kpte);
 		break;
 	default:
 		do_split = -EINVAL;
 		goto out_unlock;
 	}
 
+	psize = page_level_size(level);
+	pmask = page_level_mask(level);
+
 	/*
 	 * Calculate the number of pages, which fit into this large
 	 * page starting at address:
@@ -515,7 +519,7 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 	 * up accordingly.
 	 */
 	old_pte = *kpte;
-	old_prot = req_prot = pgprot_large_2_4k(pte_pgprot(old_pte));
+	old_prot = req_prot = pgprot_large_2_4k(old_prot);
 
 	pgprot_val(req_prot) &= ~pgprot_val(cpa->mask_clr);
 	pgprot_val(req_prot) |= pgprot_val(cpa->mask_set);
@@ -541,10 +545,10 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 	req_prot = canon_pgprot(req_prot);
 
 	/*
-	 * old_pte points to the large page base address. So we need
+	 * old_pfn points to the large page base pfn. So we need
 	 * to add the offset of the virtual address:
 	 */
-	pfn = pte_pfn(old_pte) + ((address & (psize - 1)) >> PAGE_SHIFT);
+	pfn = old_pfn + ((address & (psize - 1)) >> PAGE_SHIFT);
 	cpa->pfn = pfn;
 
 	new_prot = static_protections(req_prot, address, pfn);
@@ -555,7 +559,7 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 	 * the pages in the range we try to preserve:
 	 */
 	addr = address & pmask;
-	pfn = pte_pfn(old_pte);
+	pfn = old_pfn;
 	for (i = 0; i < (psize >> PAGE_SHIFT); i++, addr += PAGE_SIZE, pfn++) {
 		pgprot_t chk_prot = static_protections(req_prot, addr, pfn);
 
@@ -585,7 +589,7 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 		 * The address is aligned and the number of pages
 		 * covers the full page.
 		 */
-		new_pte = pfn_pte(pte_pfn(old_pte), new_prot);
+		new_pte = pfn_pte(old_pfn, new_prot);
 		__set_pmd_pte(kpte, address, new_pte);
 		cpa->flags |= CPA_FLUSHTLB;
 		do_split = 0;

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 9/10] x86/mm: Fix __split_large_page() to handle large PAT bit
  2015-08-05 21:43 [PATCH v3 0/10] x86/mm: Handle large PAT bit in pud/pmd interfaces Toshi Kani
                   ` (7 preceding siblings ...)
  2015-08-05 21:43 ` [PATCH v3 8/10] x86/mm: Fix try_preserve_large_page() " Toshi Kani
@ 2015-08-05 21:43 ` Toshi Kani
  2015-08-05 21:43 ` [PATCH v3 10/10] x86/mm: Fix the same pgprot handling in try_preserve_large_page() Toshi Kani
  9 siblings, 0 replies; 16+ messages in thread
From: Toshi Kani @ 2015-08-05 21:43 UTC (permalink / raw)
  To: hpa, tglx, mingo
  Cc: akpm, bp, linux-mm, linux-kernel, x86, jgross, konrad.wilk,
	elliott, Toshi Kani

__split_large_page() is called from __change_page_attr() to change
the map attribute by splitting the large page into smaller pages.
This function uses pte_pfn() and pte_pgprot() for PUD/PMD, which
do not handle the large PAT bit properly.

Fix __split_large_page() to use corresponding p?d_pfn() and p?d_pgprot().

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 arch/x86/mm/pageattr.c |   31 +++++++++++++++++++------------
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 2724755..d0e40ed 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -606,7 +606,7 @@ __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
 		   struct page *base)
 {
 	pte_t *pbase = (pte_t *)page_address(base);
-	unsigned long pfn, pfninc = 1;
+	unsigned long ref_pfn, pfn, pfninc = 1;
 	unsigned int i, level;
 	pte_t *tmp;
 	pgprot_t ref_prot;
@@ -623,26 +623,33 @@ __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
 	}
 
 	paravirt_alloc_pte(&init_mm, page_to_pfn(base));
-	ref_prot = pte_pgprot(pte_clrhuge(*kpte));
 
-	/* promote PAT bit to correct position */
-	if (level == PG_LEVEL_2M)
+	switch (level) {
+	case PG_LEVEL_2M:
+		ref_prot = pmd_pgprot(*(pmd_t *)kpte);
+		/* clear PSE and promote PAT bit to correct position */
 		ref_prot = pgprot_large_2_4k(ref_prot);
+		ref_pfn = pmd_pfn(*(pmd_t *)kpte);
+		break;
 
-#ifdef CONFIG_X86_64
-	if (level == PG_LEVEL_1G) {
+	case PG_LEVEL_1G:
+		ref_prot = pud_pgprot(*(pud_t *)kpte);
+		ref_pfn = pud_pfn(*(pud_t *)kpte);
 		pfninc = PMD_PAGE_SIZE >> PAGE_SHIFT;
+
 		/*
-		 * Set the PSE flags only if the PRESENT flag is set
+		 * Clear the PSE flags if the PRESENT flag is not set
 		 * otherwise pmd_present/pmd_huge will return true
 		 * even on a non present pmd.
 		 */
-		if (pgprot_val(ref_prot) & _PAGE_PRESENT)
-			pgprot_val(ref_prot) |= _PAGE_PSE;
-		else
+		if (!(pgprot_val(ref_prot) & _PAGE_PRESENT))
 			pgprot_val(ref_prot) &= ~_PAGE_PSE;
+		break;
+
+	default:
+		spin_unlock(&pgd_lock);
+		return 1;
 	}
-#endif
 
 	/*
 	 * Set the GLOBAL flags only if the PRESENT flag is set
@@ -658,7 +665,7 @@ __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
 	/*
 	 * Get the target pfn from the original entry:
 	 */
-	pfn = pte_pfn(*kpte);
+	pfn = ref_pfn;
 	for (i = 0; i < PTRS_PER_PTE; i++, pfn += pfninc)
 		set_pte(&pbase[i], pfn_pte(pfn, canon_pgprot(ref_prot)));
 


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 10/10] x86/mm: Fix the same pgprot handling in try_preserve_large_page()
  2015-08-05 21:43 [PATCH v3 0/10] x86/mm: Handle large PAT bit in pud/pmd interfaces Toshi Kani
                   ` (8 preceding siblings ...)
  2015-08-05 21:43 ` [PATCH v3 9/10] x86/mm: Fix __split_large_page() " Toshi Kani
@ 2015-08-05 21:43 ` Toshi Kani
  9 siblings, 0 replies; 16+ messages in thread
From: Toshi Kani @ 2015-08-05 21:43 UTC (permalink / raw)
  To: hpa, tglx, mingo
  Cc: akpm, bp, linux-mm, linux-kernel, x86, jgross, konrad.wilk,
	elliott, Toshi Kani

try_preserve_large_page() checks whether new_prot is the same as
old_prot.  If so, it simply sets do_split to 0 and returns without
making any change.  However, old_prot is set as a 4KB pgprot value
while new_prot is a large page pgprot value.

Now that old_prot is initially set from p?d_pgprot() as a
large page pgprot value, fix it by not overwriting old_prot
with a 4KB pgprot value.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 arch/x86/mm/pageattr.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index d0e40ed..6f9b885 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -519,7 +519,7 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 	 * up accordingly.
 	 */
 	old_pte = *kpte;
-	old_prot = req_prot = pgprot_large_2_4k(old_prot);
+	req_prot = pgprot_large_2_4k(old_prot);
 
 	pgprot_val(req_prot) &= ~pgprot_val(cpa->mask_clr);
 	pgprot_val(req_prot) |= pgprot_val(cpa->mask_set);


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 1/10] x86/vdso32: Define PGTABLE_LEVELS to 32bit VDSO
  2015-08-05 21:43 ` [PATCH v3 1/10] x86/vdso32: Define PGTABLE_LEVELS to 32bit VDSO Toshi Kani
@ 2015-08-20 19:46   ` Thomas Gleixner
  2015-08-20 23:04     ` Toshi Kani
  0 siblings, 1 reply; 16+ messages in thread
From: Thomas Gleixner @ 2015-08-20 19:46 UTC (permalink / raw)
  To: Toshi Kani
  Cc: hpa, mingo, akpm, bp, linux-mm, linux-kernel, x86, jgross,
	konrad.wilk, elliott

On Wed, 5 Aug 2015, Toshi Kani wrote:

> In case of CONFIG_X86_64, vdso32/vclock_gettime.c fakes a 32bit
> kernel configuration by re-defining it to CONFIG_X86_32.  However,
> it does not re-define CONFIG_PGTABLE_LEVELS leaving it as 4 levels.
> Fix it by re-defining CONFIG_PGTABLE_LEVELS to 2 as X86_PAE is not
> set.

You fail to explain WHY this is required. I have not yet spotted any
code in vclock_gettime.c which is affected by this.

Thanks,

	tglx


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 1/10] x86/vdso32: Define PGTABLE_LEVELS to 32bit VDSO
  2015-08-20 19:46   ` Thomas Gleixner
@ 2015-08-20 23:04     ` Toshi Kani
  2015-08-24 21:48       ` Toshi Kani
  0 siblings, 1 reply; 16+ messages in thread
From: Toshi Kani @ 2015-08-20 23:04 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: hpa, mingo, akpm, bp, linux-mm, linux-kernel, x86, jgross,
	konrad.wilk, elliott

On 8/20/2015 1:46 PM, Thomas Gleixner wrote:
> On Wed, 5 Aug 2015, Toshi Kani wrote:
>
>> In case of CONFIG_X86_64, vdso32/vclock_gettime.c fakes a 32bit
>> kernel configuration by re-defining it to CONFIG_X86_32.  However,
>> it does not re-define CONFIG_PGTABLE_LEVELS leaving it as 4 levels.
>> Fix it by re-defining CONFIG_PGTABLE_LEVELS to 2 as X86_PAE is not
>> set.
> You fail to explain WHY this is required. I have not yet spotted any
> code in vclock_gettime.c which is affected by this.

Sorry about that.  Without this patch 01, applying patches 02 & 03 causes 
the following compile errors in vclock_gettime.c.  This is because it 
includes pgtable_types.h (see below), which now requires PUD_SHIFT and 
PMD_SHIFT to be defined properly.  In case of X86_32, pgtable_types.h 
includes pgtable-nopud.h and pgtable-nopmd.h, which define these SHIFTs 
when CONFIG_PGTABLE_LEVELS is set to 2 (or 3 if PAE is also defined).

In file included from ./arch/x86/include/asm/paravirt_types.h:44:0,
                  from ./arch/x86/include/asm/ptrace.h:71,
                  from ./arch/x86/include/asm/alternative.h:8,
                  from ./arch/x86/include/asm/bitops.h:16,
                  from include/linux/bitops.h:36,
                  from include/linux/kernel.h:10,
                  from include/linux/list.h:8,
                  from include/linux/preempt.h:10,
                  from include/linux/spinlock.h:50,
                  from include/linux/seqlock.h:35,
                  from include/linux/time.h:5,
                  from include/uapi/linux/timex.h:56,
                  from include/linux/timex.h:56,
                  from include/linux/clocksource.h:12,
                  from ./arch/x86/include/asm/vgtod.h:5,
                  from arch/x86/entry/vdso/vdso32/../vclock_gettime.c:15,
                  from arch/x86/entry/vdso/vdso32/vclock_gettime.c:30:
./arch/x86/include/asm/pgtable_types.h: In function ‘pud_pfn_mask’:
./arch/x86/include/asm/pgtable_types.h:282:23: error: ‘PUD_SHIFT’ 
undeclared (first use in this function)
    return PUD_PAGE_MASK & PHYSICAL_PAGE_MASK;
                        ^
./arch/x86/include/asm/pgtable_types.h:282:23: note: each undeclared 
identifier is reported only once for each function it appears in
./arch/x86/include/asm/pgtable_types.h: In function ‘pud_flags_mask’:
./arch/x86/include/asm/pgtable_types.h:290:25: error: ‘PUD_SHIFT’ 
undeclared (first use in this function)
    return ~(PUD_PAGE_MASK & (pudval_t)PHYSICAL_PAGE_MASK);
                          ^
./arch/x86/include/asm/pgtable_types.h: In function ‘pmd_pfn_mask’:
./arch/x86/include/asm/pgtable_types.h:303:23: error: ‘PMD_SHIFT’ 
undeclared (first use in this function)
    return PMD_PAGE_MASK & PHYSICAL_PAGE_MASK;
                        ^
./arch/x86/include/asm/pgtable_types.h: In function ‘pmd_flags_mask’:
./arch/x86/include/asm/pgtable_types.h:311:25: error: ‘PMD_SHIFT’ 
undeclared (first use in this function)
    return ~(PMD_PAGE_MASK & (pmdval_t)PHYSICAL_PAGE_MASK);
                          ^
scripts/Makefile.build:258: recipe for target 
'arch/x86/entry/vdso/vdso32/vclock_gettime.o' failed
make[3]: *** [arch/x86/entry/vdso/vdso32/vclock_gettime.o] Error 1
make[3]: *** Waiting for unfinished jobs....

Thanks,
-Toshi


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 1/10] x86/vdso32: Define PGTABLE_LEVELS to 32bit VDSO
  2015-08-20 23:04     ` Toshi Kani
@ 2015-08-24 21:48       ` Toshi Kani
  0 siblings, 0 replies; 16+ messages in thread
From: Toshi Kani @ 2015-08-24 21:48 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: hpa, mingo, akpm, bp, linux-mm, linux-kernel, x86, jgross,
	konrad.wilk, elliott

On Thu, 2015-08-20 at 17:04 -0600, Toshi Kani wrote:
> On 8/20/2015 1:46 PM, Thomas Gleixner wrote:
> > On Wed, 5 Aug 2015, Toshi Kani wrote:
> > 
> > > In case of CONFIG_X86_64, vdso32/vclock_gettime.c fakes a 32bit
> > > kernel configuration by re-defining it to CONFIG_X86_32.  However,
> > > it does not re-define CONFIG_PGTABLE_LEVELS leaving it as 4 levels.
> > > Fix it by re-defining CONFIG_PGTABLE_LEVELS to 2 as X86_PAE is not
> > > set.
> > You fail to explain WHY this is required. I have not yet spotted any
> > code in vclock_gettime.c which is affected by this.
> 
> Sorry about that.  Without this patch 01, applying patches 02 & 03 causes 
> the following compile errors in vclock_gettime.c.  This is because it 
> includes pgtable_types.h (see below), which now requires PUD_SHIFT and 
> PMD_SHIFT to be defined properly.  In case of X86_32, pgtable_types.h 
> includes pgtable-nopud.h and pgtable-nopmd.h, which define these SHIFTs 
> when CONFIG_PGTABLE_LEVELS is set to 2 (or 3 if PAE is also defined).
>  :

Attached is patch 01/10 with updated descriptions.  The rest of the patchset
still applies cleanly.

Please let me know if you have any further comments.
Thanks,
-Toshi

----
Subject: [PATCH v3 UPDATE 1/10] x86/vdso32: Define PGTABLE_LEVELS to 32bit
VDSO

In case of CONFIG_X86_64, vdso32/vclock_gettime.c fakes a 32-bit
non-PAE kernel configuration by re-defining it to CONFIG_X86_32.
However, it does not re-define CONFIG_PGTABLE_LEVELS leaving it
as 4 levels.

This mismatch leads <asm/pgtable_types.h> to NOT include <asm-generic/
pgtable-nopud.h> and <asm-generic/pgtable-nopmd.h>, which will cause
compile errors when a later patch enhances <asm/pgtable_types.h> to
use PUD_SHIFT and PMD_SHIFT.  These -nopud & -nopmd headers define
these SHIFTs for the 32-bit non-PAE kernel.

Fix it by re-defining CONFIG_PGTABLE_LEVELS to 2 levels.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
---
 arch/x86/entry/vdso/vdso32/vclock_gettime.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/entry/vdso/vdso32/vclock_gettime.c
b/arch/x86/entry/vdso/vdso32/vclock_gettime.c
index 175cc72..87a86e0 100644
--- a/arch/x86/entry/vdso/vdso32/vclock_gettime.c
+++ b/arch/x86/entry/vdso/vdso32/vclock_gettime.c
@@ -14,11 +14,13 @@
  */
 #undef CONFIG_64BIT
 #undef CONFIG_X86_64
+#undef CONFIG_PGTABLE_LEVELS
 #undef CONFIG_ILLEGAL_POINTER_VALUE
 #undef CONFIG_SPARSEMEM_VMEMMAP
 #undef CONFIG_NR_CPUS
 
 #define CONFIG_X86_32 1
+#define CONFIG_PGTABLE_LEVELS 2
 #define CONFIG_PAGE_OFFSET 0
 #define CONFIG_ILLEGAL_POINTER_VALUE 0
 #define CONFIG_NR_CPUS 1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 3/10] x86/asm: Fix pud/pmd interfaces to handle large PAT bit
  2015-08-05 21:43 ` [PATCH v3 3/10] x86/asm: Fix pud/pmd interfaces to handle large PAT bit Toshi Kani
@ 2015-08-25  8:16   ` Thomas Gleixner
  2015-08-25 14:15     ` Toshi Kani
  0 siblings, 1 reply; 16+ messages in thread
From: Thomas Gleixner @ 2015-08-25  8:16 UTC (permalink / raw)
  To: Toshi Kani
  Cc: hpa, mingo, akpm, bp, linux-mm, linux-kernel, x86, jgross,
	konrad.wilk, elliott

On Wed, 5 Aug 2015, Toshi Kani wrote:

> The PAT bit gets relocated to bit 12 when PUD and PMD mappings are
> used.  This bit 12, however, is not covered by PTE_FLAGS_MASK, which
> is currently used for masking pfn and flags for all cases.
> 
> Fix pud/pmd interfaces to handle pfn and flags properly by using
> P?D_PAGE_MASK when PUD/PMD mappings are used, i.e. PSE bit is set.

Can you please split that into a patch introducing and describing the
new mask helper functions and a second one making use of it?

Thanks,

	tglx


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 3/10] x86/asm: Fix pud/pmd interfaces to handle large PAT bit
  2015-08-25  8:16   ` Thomas Gleixner
@ 2015-08-25 14:15     ` Toshi Kani
  0 siblings, 0 replies; 16+ messages in thread
From: Toshi Kani @ 2015-08-25 14:15 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: hpa, mingo, akpm, bp, linux-mm, linux-kernel, x86, jgross,
	konrad.wilk, elliott

On Tue, 2015-08-25 at 10:16 +0200, Thomas Gleixner wrote:
> On Wed, 5 Aug 2015, Toshi Kani wrote:
> 
> > The PAT bit gets relocated to bit 12 when PUD and PMD mappings are
> > used.  This bit 12, however, is not covered by PTE_FLAGS_MASK, which
> > is currently used for masking pfn and flags for all cases.
> > 
> > Fix pud/pmd interfaces to handle pfn and flags properly by using
> > P?D_PAGE_MASK when PUD/PMD mappings are used, i.e. PSE bit is set.
> 
> Can you please split that into a patch introducing and describing the
> new mask helper functions and a second one making use of it?

Will do.  I will send out v4 patchset today with this update (and the patch
01 update). 

Thanks,
-Toshi


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2015-08-25 14:18 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --)
2015-08-05 21:43 [PATCH v3 0/10] x86/mm: Handle large PAT bit in pud/pmd interfaces Toshi Kani
2015-08-05 21:43 ` [PATCH v3 1/10] x86/vdso32: Define PGTABLE_LEVELS to 32bit VDSO Toshi Kani
2015-08-20 19:46   ` Thomas Gleixner
2015-08-20 23:04     ` Toshi Kani
2015-08-24 21:48       ` Toshi Kani
2015-08-05 21:43 ` [PATCH v3 2/10] x86/asm: Move PUD_PAGE macros to page_types.h Toshi Kani
2015-08-05 21:43 ` [PATCH v3 3/10] x86/asm: Fix pud/pmd interfaces to handle large PAT bit Toshi Kani
2015-08-25  8:16   ` Thomas Gleixner
2015-08-25 14:15     ` Toshi Kani
2015-08-05 21:43 ` [PATCH v3 4/10] x86/asm: Add pud_pgprot() and pmd_pgprot() Toshi Kani
2015-08-05 21:43 ` [PATCH v3 5/10] x86/mm: Fix page table dump to show PAT bit Toshi Kani
2015-08-05 21:43 ` [PATCH v3 6/10] x86/mm: Fix slow_virt_to_phys() to handle large " Toshi Kani
2015-08-05 21:43 ` [PATCH v3 7/10] x86/mm: Fix gup_huge_p?d() " Toshi Kani
2015-08-05 21:43 ` [PATCH v3 8/10] x86/mm: Fix try_preserve_large_page() " Toshi Kani
2015-08-05 21:43 ` [PATCH v3 9/10] x86/mm: Fix __split_large_page() " Toshi Kani
2015-08-05 21:43 ` [PATCH v3 10/10] x86/mm: Fix the same pgprot handling in try_preserve_large_page() Toshi Kani

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).