linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS
@ 2012-09-20 15:56 Will Deacon
  2012-09-20 15:56 ` [PATCH 1/4] ARM: mm: use pteval_t to represent page protection values Will Deacon
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Will Deacon @ 2012-09-20 15:56 UTC
  To: linux-arm-kernel

Hello,

After laughing at Ben H during LPC when it emerged that the PPC PROT_NONE
protection bits don't prevent kernel access to the protected pages, I
looked at the ARM code and, to my dismay, found that we have the same
problem when not using domains.

This patch series addresses the issue with the following points worth
noting:

	- We use the last available software bit (11) for 2-level PTEs.
	  Whilst this is somewhat of a pity, I can't think of a better
	  reason to allocate it than to fix an outstanding bug.

	- There is a horrible bug where the protection map values are
	  currently truncated with LPAE -- that is also fixed in this
	  series.

	- We add a new L_PTE_VALID bit to allow for present, faulting
	  pages in LPAE (where the software bits overlap with the
	  hardware bits).

Taken against -rc6 and tested for both 2 and 3 levels of page table with
a simple application reading an mprotect(PROT_NONE) region over a pipe.
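
Something along these lines (a minimal sketch of the idea, not the
actual test application) is enough to show the bug: the write() makes
the kernel read the PROT_NONE page on the task's behalf, and a fixed
kernel must fail it with EFAULT.

	#include <errno.h>
	#include <stdio.h>
	#include <string.h>
	#include <sys/mman.h>
	#include <unistd.h>

	int main(void)
	{
		int fds[2];
		char *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
				  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		memset(page, 0x55, 4096);
		mprotect(page, 4096, PROT_NONE);
		pipe(fds);

		/* The kernel reads the page via its user accessors. */
		if (write(fds[1], page, 4096) < 0 && errno == EFAULT)
			puts("PASS: PROT_NONE page not readable by the kernel");
		else
			puts("FAIL: kernel read a PROT_NONE page");
		return 0;
	}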

All comments welcome,

Will


Will Deacon (4):
  ARM: mm: use pteval_t to represent page protection values
  ARM: mm: don't use the access flag permissions mechanism for classic
    MMU
  ARM: mm: introduce L_PTE_VALID for page table entries
  ARM: mm: introduce present, faulting entries for PAGE_NONE

 arch/arm/include/asm/pgtable-2level.h |    2 ++
 arch/arm/include/asm/pgtable-3level.h |    4 +++-
 arch/arm/include/asm/pgtable.h        |   10 ++++------
 arch/arm/mm/mmu.c                     |    2 +-
 arch/arm/mm/proc-macros.S             |    4 ++++
 arch/arm/mm/proc-v7-2level.S          |   10 +++++++---
 arch/arm/mm/proc-v7-3level.S          |    5 ++++-
 7 files changed, 25 insertions(+), 12 deletions(-)

-- 
1.7.4.1

* [PATCH 1/4] ARM: mm: use pteval_t to represent page protection values
  2012-09-20 15:56 [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS Will Deacon
@ 2012-09-20 15:56 ` Will Deacon
  2012-09-20 18:28   ` Nicolas Pitre
  2012-09-21  8:48   ` Catalin Marinas
  2012-09-20 15:56 ` [PATCH 2/4] ARM: mm: don't use the access flag permissions mechanism for classic MMU Will Deacon
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 14+ messages in thread
From: Will Deacon @ 2012-09-20 15:56 UTC
  To: linux-arm-kernel

When updating the page protection map after calculating the user_pgprot
value, the base protection map is temporarily stored in an unsigned long
type, causing truncation of the protection bits when LPAE is enabled.
This effectively means that calls to mprotect() will corrupt the upper
page attributes, clearing the XN bit unconditionally.

This patch uses pteval_t to store the intermediate protection values,
preserving the upper bits for 64-bit descriptors.
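
To see why this matters: with LPAE, pteval_t is a 64-bit type and
L_PTE_XN lives at bit 54, so a 32-bit unsigned long intermediate
silently discards the whole upper word. A standalone illustration
(not kernel code):

	typedef unsigned long long pteval_t;	/* 64-bit under LPAE */
	#define L_PTE_XN	((pteval_t)1 << 54)	/* upper-word attribute */

	pteval_t demo(pteval_t prot)	/* prot has L_PTE_XN set */
	{
		unsigned long v = prot;	/* 32 bits on ARM: bits 63:32 lost */
		return v;		/* XN cleared: page became executable */
	}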

Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/mm/mmu.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index c2fa21d..b68b531 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -498,7 +498,7 @@ static void __init build_mem_type_table(void)
 #endif
 
 	for (i = 0; i < 16; i++) {
-		unsigned long v = pgprot_val(protection_map[i]);
+		pteval_t v = pgprot_val(protection_map[i]);
 		protection_map[i] = __pgprot(v | user_pgprot);
 	}
 
-- 
1.7.4.1

* [PATCH 2/4] ARM: mm: don't use the access flag permissions mechanism for classic MMU
  2012-09-20 15:56 [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS Will Deacon
  2012-09-20 15:56 ` [PATCH 1/4] ARM: mm: use pteval_t to represent page protection values Will Deacon
@ 2012-09-20 15:56 ` Will Deacon
  2012-09-21  8:40   ` Catalin Marinas
  2012-09-20 15:56 ` [PATCH 3/4] ARM: mm: introduce L_PTE_VALID for page table entries Will Deacon
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Will Deacon @ 2012-09-20 15:56 UTC
  To: linux-arm-kernel

The simplified access permissions model is not used for the classic MMU
translation regime, so ensure that it is turned off in the sctlr prior
to turning on address translation for ARMv7.
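
The crval change below widens the "clear" mask by exactly one bit:
bit 29 of the SCTLR, which is AFE (Access Flag Enable) on ARMv7 and
selects the simplified permissions model. A quick standalone check of
the arithmetic:

	#include <assert.h>

	#define SCTLR_AFE	(1u << 29)	/* ARMv7 SCTLR.AFE */

	int main(void)
	{
		assert((0x0120c302u | SCTLR_AFE) == 0x2120c302u);
		return 0;
	}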

Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/mm/proc-v7-2level.S |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/mm/proc-v7-2level.S b/arch/arm/mm/proc-v7-2level.S
index fd045e7..e37600b 100644
--- a/arch/arm/mm/proc-v7-2level.S
+++ b/arch/arm/mm/proc-v7-2level.S
@@ -161,11 +161,11 @@ ENDPROC(cpu_v7_set_pte_ext)
 	 *  TFR   EV X F   I D LR    S
 	 * .EEE ..EE PUI. .T.T 4RVI ZWRS BLDP WCAM
 	 * rxxx rrxx xxx0 0101 xxxx xxxx x111 xxxx < forced
-	 *    1    0 110       0011 1100 .111 1101 < we want
+	 *   01    0 110       0011 1100 .111 1101 < we want
 	 */
 	.align	2
 	.type	v7_crval, #object
 v7_crval:
-	crval	clear=0x0120c302, mmuset=0x10c03c7d, ucset=0x00c01c7c
+	crval	clear=0x2120c302, mmuset=0x10c03c7d, ucset=0x00c01c7c
 
 	.previous
-- 
1.7.4.1

* [PATCH 3/4] ARM: mm: introduce L_PTE_VALID for page table entries
  2012-09-20 15:56 [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS Will Deacon
  2012-09-20 15:56 ` [PATCH 1/4] ARM: mm: use pteval_t to represent page protection values Will Deacon
  2012-09-20 15:56 ` [PATCH 2/4] ARM: mm: don't use the access flag permissions mechanism for classic MMU Will Deacon
@ 2012-09-20 15:56 ` Will Deacon
  2012-09-20 22:21   ` Russell King - ARM Linux
  2012-09-20 15:56 ` [PATCH 4/4] ARM: mm: introduce present, faulting entries for PAGE_NONE Will Deacon
  2012-09-20 22:12 ` [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS Russell King - ARM Linux
  4 siblings, 1 reply; 14+ messages in thread
From: Will Deacon @ 2012-09-20 15:56 UTC
  To: linux-arm-kernel

For long-descriptor translation table formats, the ARMv7 architecture
defines the last two bits of the second- and third-level descriptors to
be:

	x0b	- Invalid
	01b	- Block (second-level), Reserved (third-level)
	11b	- Table (second-level), Page (third-level)

This allows us to define L_PTE_PRESENT as (3 << 0) and use this value to
create ptes directly. However, when determining whether a given pte
value is present in the low-level page table accessors, we only need to
check the least significant bit of the descriptor, allowing us to write
faulting, present entries which are required for PROT_NONE mappings.

This patch introduces L_PTE_VALID, which can be used to test whether a
pte should fault, and updates the low-level page table accessors
accordingly.
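
In other words, using the LPAE values from the hunks below (a sketch
for illustration; pte_faults() is a made-up helper, not kernel code):

	typedef unsigned long long pteval_t;
	#define L_PTE_VALID	((pteval_t)1 << 0)	/* what the MMU checks */
	#define L_PTE_PRESENT	((pteval_t)3 << 0)	/* what Linux sets */

	static int pte_faults(pteval_t pte)
	{
		/* Clearing only bit 0 gives x0b: invalid to the hardware,
		 * while bit 1 survives, so Linux still sees the pte as
		 * present. */
		return !(pte & L_PTE_VALID);
	}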

Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/pgtable-2level.h |    1 +
 arch/arm/include/asm/pgtable-3level.h |    3 ++-
 arch/arm/include/asm/pgtable.h        |    4 +---
 arch/arm/mm/proc-v7-2level.S          |    2 +-
 arch/arm/mm/proc-v7-3level.S          |    2 +-
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
index 2317a71..c44a1ec 100644
--- a/arch/arm/include/asm/pgtable-2level.h
+++ b/arch/arm/include/asm/pgtable-2level.h
@@ -115,6 +115,7 @@
  * The PTE table pointer refers to the hardware entries; the "Linux"
  * entries are stored 1024 bytes below.
  */
+#define L_PTE_VALID		(_AT(pteval_t, 1) << 0)		/* Valid */
 #define L_PTE_PRESENT		(_AT(pteval_t, 1) << 0)
 #define L_PTE_YOUNG		(_AT(pteval_t, 1) << 1)
 #define L_PTE_FILE		(_AT(pteval_t, 1) << 2)	/* only when !PRESENT */
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index b249035..e32311a 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -67,7 +67,8 @@
  * These bits overlap with the hardware bits but the naming is preserved for
  * consistency with the classic page table format.
  */
-#define L_PTE_PRESENT		(_AT(pteval_t, 3) << 0)		/* Valid */
+#define L_PTE_VALID		(_AT(pteval_t, 1) << 0)		/* Valid */
+#define L_PTE_PRESENT		(_AT(pteval_t, 3) << 0)		/* Present */
 #define L_PTE_FILE		(_AT(pteval_t, 1) << 2)		/* only when !PRESENT */
 #define L_PTE_USER		(_AT(pteval_t, 1) << 6)		/* AP[1] */
 #define L_PTE_RDONLY		(_AT(pteval_t, 1) << 7)		/* AP[2] */
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index 41dc31f..df206c1 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -203,9 +203,7 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 #define pte_exec(pte)		(!(pte_val(pte) & L_PTE_XN))
 #define pte_special(pte)	(0)
 
-#define pte_present_user(pte) \
-	((pte_val(pte) & (L_PTE_PRESENT | L_PTE_USER)) == \
-	 (L_PTE_PRESENT | L_PTE_USER))
+#define pte_present_user(pte)  (pte_present(pte) && (pte_val(pte) & L_PTE_USER))
 
 #if __LINUX_ARM_ARCH__ < 6
 static inline void __sync_icache_dcache(pte_t pteval)
diff --git a/arch/arm/mm/proc-v7-2level.S b/arch/arm/mm/proc-v7-2level.S
index e37600b..e755e9f 100644
--- a/arch/arm/mm/proc-v7-2level.S
+++ b/arch/arm/mm/proc-v7-2level.S
@@ -100,7 +100,7 @@ ENTRY(cpu_v7_set_pte_ext)
 	orrne	r3, r3, #PTE_EXT_XN
 
 	tst	r1, #L_PTE_YOUNG
-	tstne	r1, #L_PTE_PRESENT
+	tstne	r1, #L_PTE_VALID
 	moveq	r3, #0
 
  ARM(	str	r3, [r0, #2048]! )
diff --git a/arch/arm/mm/proc-v7-3level.S b/arch/arm/mm/proc-v7-3level.S
index 8de0f1d..d23d067 100644
--- a/arch/arm/mm/proc-v7-3level.S
+++ b/arch/arm/mm/proc-v7-3level.S
@@ -65,7 +65,7 @@ ENDPROC(cpu_v7_switch_mm)
  */
 ENTRY(cpu_v7_set_pte_ext)
 #ifdef CONFIG_MMU
-	tst	r2, #L_PTE_PRESENT
+	tst	r2, #L_PTE_VALID
 	beq	1f
 	tst	r3, #1 << (55 - 32)		@ L_PTE_DIRTY
 	orreq	r2, #L_PTE_RDONLY
-- 
1.7.4.1

* [PATCH 4/4] ARM: mm: introduce present, faulting entries for PAGE_NONE
  2012-09-20 15:56 [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS Will Deacon
                   ` (2 preceding siblings ...)
  2012-09-20 15:56 ` [PATCH 3/4] ARM: mm: introduce L_PTE_VALID for page table entries Will Deacon
@ 2012-09-20 15:56 ` Will Deacon
  2012-09-20 22:12 ` [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS Russell King - ARM Linux
  4 siblings, 0 replies; 14+ messages in thread
From: Will Deacon @ 2012-09-20 15:56 UTC
  To: linux-arm-kernel

PROT_NONE mappings apply the page protection attributes defined by _P000
which translate to PAGE_NONE for ARM. These attributes specify an XN,
RDONLY pte that is inaccessible to userspace. However, on kernels
configured without support for domains, such a pte *is* accessible to
the kernel and can be read via get_user, allowing tasks to read
PROT_NONE pages via syscalls such as read/write over a pipe.

This patch introduces a new software pte flag, L_PTE_NONE, that is set
to identify faulting, present entries.
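
Restated in C, the decision the set_pte_ext paths below encode is
roughly the following (a sketch only; mk_hw_prot() is a hypothetical
stand-in for the real Linux-to-hardware bit translation):

	typedef unsigned long pteval_t;		/* classic MMU */
	#define L_PTE_VALID	((pteval_t)1 << 0)
	#define L_PTE_YOUNG	((pteval_t)1 << 1)
	#define L_PTE_NONE	((pteval_t)1 << 11)

	extern pteval_t mk_hw_prot(pteval_t lpte);

	pteval_t hw_pte(pteval_t lpte, int use_domains)
	{
		if (!(lpte & L_PTE_YOUNG) || !(lpte & L_PTE_VALID))
			return 0;	/* old or invalid: hardware must fault */
		if (!use_domains && (lpte & L_PTE_NONE))
			return 0;	/* PROT_NONE: fault kernel accesses too */
		return mk_hw_prot(lpte);
	}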

Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/pgtable-2level.h |    1 +
 arch/arm/include/asm/pgtable-3level.h |    1 +
 arch/arm/include/asm/pgtable.h        |    6 +++---
 arch/arm/mm/proc-macros.S             |    4 ++++
 arch/arm/mm/proc-v7-2level.S          |    4 ++++
 arch/arm/mm/proc-v7-3level.S          |    3 +++
 6 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
index c44a1ec..f97ee02 100644
--- a/arch/arm/include/asm/pgtable-2level.h
+++ b/arch/arm/include/asm/pgtable-2level.h
@@ -124,6 +124,7 @@
 #define L_PTE_USER		(_AT(pteval_t, 1) << 8)
 #define L_PTE_XN		(_AT(pteval_t, 1) << 9)
 #define L_PTE_SHARED		(_AT(pteval_t, 1) << 10)	/* shared(v6), coherent(xsc3) */
+#define L_PTE_NONE		(_AT(pteval_t, 1) << 11)
 
 /*
  * These are the memory types, defined to be compatible with
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index e32311a..a3f3792 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -77,6 +77,7 @@
 #define L_PTE_XN		(_AT(pteval_t, 1) << 54)	/* XN */
 #define L_PTE_DIRTY		(_AT(pteval_t, 1) << 55)	/* unused */
 #define L_PTE_SPECIAL		(_AT(pteval_t, 1) << 56)	/* unused */
+#define L_PTE_NONE		(_AT(pteval_t, 1) << 57)	/* PROT_NONE */
 
 /*
  * To be used in assembly code with the upper page attributes.
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index df206c1..7e21903 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -73,7 +73,7 @@ extern pgprot_t		pgprot_kernel;
 
 #define _MOD_PROT(p, b)	__pgprot(pgprot_val(p) | (b))
 
-#define PAGE_NONE		_MOD_PROT(pgprot_user, L_PTE_XN | L_PTE_RDONLY)
+#define PAGE_NONE		_MOD_PROT(pgprot_user, L_PTE_XN | L_PTE_RDONLY | L_PTE_NONE)
 #define PAGE_SHARED		_MOD_PROT(pgprot_user, L_PTE_USER | L_PTE_XN)
 #define PAGE_SHARED_EXEC	_MOD_PROT(pgprot_user, L_PTE_USER)
 #define PAGE_COPY		_MOD_PROT(pgprot_user, L_PTE_USER | L_PTE_RDONLY | L_PTE_XN)
@@ -83,7 +83,7 @@ extern pgprot_t		pgprot_kernel;
 #define PAGE_KERNEL		_MOD_PROT(pgprot_kernel, L_PTE_XN)
 #define PAGE_KERNEL_EXEC	pgprot_kernel
 
-#define __PAGE_NONE		__pgprot(_L_PTE_DEFAULT | L_PTE_RDONLY | L_PTE_XN)
+#define __PAGE_NONE		__pgprot(_L_PTE_DEFAULT | L_PTE_RDONLY | L_PTE_XN | L_PTE_NONE)
 #define __PAGE_SHARED		__pgprot(_L_PTE_DEFAULT | L_PTE_USER | L_PTE_XN)
 #define __PAGE_SHARED_EXEC	__pgprot(_L_PTE_DEFAULT | L_PTE_USER)
 #define __PAGE_COPY		__pgprot(_L_PTE_DEFAULT | L_PTE_USER | L_PTE_RDONLY | L_PTE_XN)
@@ -240,7 +240,7 @@ static inline pte_t pte_mkspecial(pte_t pte) { return pte; }
 
 static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 {
-	const pteval_t mask = L_PTE_XN | L_PTE_RDONLY | L_PTE_USER;
+	const pteval_t mask = L_PTE_XN | L_PTE_RDONLY | L_PTE_USER | L_PTE_NONE;
 	pte_val(pte) = (pte_val(pte) & ~mask) | (pgprot_val(newprot) & mask);
 	return pte;
 }
diff --git a/arch/arm/mm/proc-macros.S b/arch/arm/mm/proc-macros.S
index 2d8ff3a..58b201c 100644
--- a/arch/arm/mm/proc-macros.S
+++ b/arch/arm/mm/proc-macros.S
@@ -167,6 +167,10 @@
 	tst	r1, #L_PTE_YOUNG
 	tstne	r1, #L_PTE_PRESENT
 	moveq	r3, #0
+#ifndef CONFIG_CPU_USE_DOMAINS
+	tstne	r1, #L_PTE_NONE
+	movne	r3, #0
+#endif
 
 	str	r3, [r0]
 	mcr	p15, 0, r0, c7, c10, 1		@ flush_pte
diff --git a/arch/arm/mm/proc-v7-2level.S b/arch/arm/mm/proc-v7-2level.S
index e755e9f..6d98c13 100644
--- a/arch/arm/mm/proc-v7-2level.S
+++ b/arch/arm/mm/proc-v7-2level.S
@@ -101,6 +101,10 @@ ENTRY(cpu_v7_set_pte_ext)
 
 	tst	r1, #L_PTE_YOUNG
 	tstne	r1, #L_PTE_VALID
+#ifndef CONFIG_CPU_USE_DOMAINS
+	eorne	r1, r1, #L_PTE_NONE
+	tstne	r1, #L_PTE_NONE
+#endif
 	moveq	r3, #0
 
  ARM(	str	r3, [r0, #2048]! )
diff --git a/arch/arm/mm/proc-v7-3level.S b/arch/arm/mm/proc-v7-3level.S
index d23d067..7b56386 100644
--- a/arch/arm/mm/proc-v7-3level.S
+++ b/arch/arm/mm/proc-v7-3level.S
@@ -67,6 +67,9 @@ ENTRY(cpu_v7_set_pte_ext)
 #ifdef CONFIG_MMU
 	tst	r2, #L_PTE_VALID
 	beq	1f
+	tst	r3, #1 << (57 - 32)		@ L_PTE_NONE
+	bicne	r2, #L_PTE_VALID
+	bne	1f
 	tst	r3, #1 << (55 - 32)		@ L_PTE_DIRTY
 	orreq	r2, #L_PTE_RDONLY
 1:	strd	r2, r3, [r0]
-- 
1.7.4.1

* [PATCH 1/4] ARM: mm: use pteval_t to represent page protection values
  2012-09-20 15:56 ` [PATCH 1/4] ARM: mm: use pteval_t to represent page protection values Will Deacon
@ 2012-09-20 18:28   ` Nicolas Pitre
  2012-09-20 21:07     ` Will Deacon
  2012-09-21  8:48   ` Catalin Marinas
  1 sibling, 1 reply; 14+ messages in thread
From: Nicolas Pitre @ 2012-09-20 18:28 UTC
  To: linux-arm-kernel

On Thu, 20 Sep 2012, Will Deacon wrote:

> When updating the page protection map after calculating the user_pgprot
> value, the base protection map is temporarily stored in an unsigned long
> type, causing truncation of the protection bits when LPAE is enabled.
> This effectively means that calls to mprotect() will corrupt the upper
> page attributes, clearing the XN bit unconditionally.
> 
> This patch uses pteval_t to store the intermediate protection values,
> preserving the upper bits for 64-bit descriptors.
> 
> Signed-off-by: Will Deacon <will.deacon@arm.com>

Acked-by: Nicolas Pitre <nico@linaro.org>

Maybe CC to stable?

Are you sure there are no other occurrences of this elsewhere?
Unfortunately, STRICT_MM_TYPECHECKS can't help here.

> ---
>  arch/arm/mm/mmu.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> index c2fa21d..b68b531 100644
> --- a/arch/arm/mm/mmu.c
> +++ b/arch/arm/mm/mmu.c
> @@ -498,7 +498,7 @@ static void __init build_mem_type_table(void)
>  #endif
>  
>  	for (i = 0; i < 16; i++) {
> -		unsigned long v = pgprot_val(protection_map[i]);
> +		pteval_t v = pgprot_val(protection_map[i]);
>  		protection_map[i] = __pgprot(v | user_pgprot);
>  	}
>  
> -- 
> 1.7.4.1
> 

* [PATCH 1/4] ARM: mm: use pteval_t to represent page protection values
  2012-09-20 18:28   ` Nicolas Pitre
@ 2012-09-20 21:07     ` Will Deacon
  0 siblings, 0 replies; 14+ messages in thread
From: Will Deacon @ 2012-09-20 21:07 UTC
  To: linux-arm-kernel

Hi Nicolas,

On Thu, Sep 20, 2012 at 07:28:14PM +0100, Nicolas Pitre wrote:
> On Thu, 20 Sep 2012, Will Deacon wrote:
> 
> > When updating the page protection map after calculating the user_pgprot
> > value, the base protection map is temporarily stored in an unsigned long
> > type, causing truncation of the protection bits when LPAE is enabled.
> > This effectively means that calls to mprotect() will corrupt the upper
> > page attributes, clearing the XN bit unconditionally.
> > 
> > This patch uses pteval_t to store the intermediate protection values,
> > preserving the upper bits for 64-bit descriptors.
> > 
> > Signed-off-by: Will Deacon <will.deacon@arm.com>
> 
> Acked-by: Nicolas Pitre <nico@linaro.org>

Thanks.

> Maybe CC to stable?

Of course.

> Are you sure there are no other occurrences of this elsewhere?
> Unfortunately, STRICT_MM_TYPECHECKS can't help here.

I grepped for lines containing `pgprot' and `long' and didn't find any
other problematic code. Everything else seems to use the pgprot_t types
consistently.

Will

* [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS
  2012-09-20 15:56 [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS Will Deacon
                   ` (3 preceding siblings ...)
  2012-09-20 15:56 ` [PATCH 4/4] ARM: mm: introduce present, faulting entries for PAGE_NONE Will Deacon
@ 2012-09-20 22:12 ` Russell King - ARM Linux
  2012-09-20 22:23   ` Benjamin Herrenschmidt
  2012-09-21  9:32   ` Will Deacon
  4 siblings, 2 replies; 14+ messages in thread
From: Russell King - ARM Linux @ 2012-09-20 22:12 UTC
  To: linux-arm-kernel

On Thu, Sep 20, 2012 at 04:56:41PM +0100, Will Deacon wrote:
> Hello,
> 
> After laughing at Ben H during LPC when it emerged that the PPC PROT_NONE
> protection bits don't prevent kernel access to the protected pages, I
> looked at the ARM code and, to my dismay, found that we have the same
> problem when not using domains.
> 
> This patch series addresses the issue with the following points worth
> noting:
> 
> 	- We use the last available software bit (11) for 2-level PTEs.
> 	  Whilst this is somewhat of a pity, I can't think of a better
> 	  reason to allocate it than to fix an outstanding bug.

But you don't - you've defined it to be bit 0, which is the same as the
present bit, and clearing the present bit is not a good idea as it turns
the entry into something like a swap entry.

So, let's go back and restart the analysis (and document it).  What
happens is we call pte_modify() on each present page, setting the new
protection - __PAGE_NONE.

__PAGE_NONE is defined as: _L_PTE_DEFAULT | L_PTE_RDONLY | L_PTE_XN

which translates to: present | young | read only | xn - with no user
bit set.  No user bit set means that userspace accesses, as well as
get_user() and put_user(), must fail when the page is accessed.

Now, this results in the pages being present (and they really are present
in userspace) but with no accessibility allowed for user-mode accesses,
and read-only for kernel-mode accesses.  (The permission values
for ARMv7 are: APX=1 AP1=0 AP0=1.)

Now, this works fine when DOMAINS are enabled, because we use the
T-bit operations to ensure that the accesses are checked against
userspace permissions.

However, when DOMAINS are disabled, we use standard loads/stores, which
check against kernel permissions, and so _any_ PROT_NONE mmap() will be
readable by any of the kernel user access functions.
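
To make that concrete, the user accessors pick their instruction with
something along these lines (a paraphrase of the TUSER() helper in
asm/assembler.h, from memory rather than quoted):

	#ifdef CONFIG_CPU_USE_DOMAINS
	#define TUSER(instr)	instr ## t	/* ldrt/strt: user permissions */
	#else
	#define TUSER(instr)	instr		/* ldr/str: kernel permissions */
	#endif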

So, on ARMv5 and below, these bits get translated as...

L_PTE_RDONLY	L_PTE_USER	AP
1		0		00	(user n/a, system r/o)
0		0		01	(user n/a, system r/w)
1		1		10	(user r/o, system r/w)
0		1		11	(user r/w, system r/w)

and this is fine because we always use domains (we must _never_ not use
domains here, because the available permissions that the architecture
gives us do not allow it.)

On ARMv6+ with domains:

L_PTE_RDONLY	L_PTE_USER	AP
1		0		101	(user n/a, system r/o)
0		0		001	(user n/a, system r/w)
1		1		010	(user r/o, system r/w)
0		1		011	(user r/w, system r/w)

which is also fine, as user accesses are performed using T-bit loads and
stores.

On ARMv6+ without domains, only one of these changes:

L_PTE_RDONLY	L_PTE_USER	AP
1		0		101	(user n/a, system r/o)
0		0		001	(user n/a, system r/w)
1		1		111	(user r/o, system r/o)
0		1		011	(user r/w, system r/w)

However, now we perform user accesses using kernel-mode loads/stores,
so the access permissions which are checked are the 'system'
permissions, and as we can see from the above, there is no state in
which the kernel is denied access to the page.

Why not make system access permissions reflect user access permissions?
Well, that's easy to answer - you want your kernel data and code to be
protected against reading/writing by userspace... so we need a
combination of permissions to permit those encodings - and not
surprisingly, clearing the L_PTE_USER bit gives us that.

What wasn't thought about properly was the effect of removing domains
and converting the kernel-initiated userspace accesses to use kernel-mode
accesses.  Yes, the obvious was changed (so that user r/o also became
kernel r/o) but that's not enough.

The only solution I can see is to have L_PTE_KERNEL which indicates
whether this is a kernel mapping.  If both L_PTE_KERNEL and L_PTE_USER
are clear, then we zero out the hardware PTE entry - and that would be
the case for a userspace PROT_NONE entry.  Otherwise, we fall back to
our current behaviour.

So, essentially we change from ignoring the PTE value when

	(val & (PRESENT | YOUNG)) != (PRESENT | YOUNG)

to:

	((val & (PRESENT | YOUNG | USER)) != (PRESENT | YOUNG | USER)) &&
	((val & (PRESENT | YOUNG | KERNEL)) != (PRESENT | YOUNG | KERNEL))
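
Spelled out as C, that's something like this (a sketch of the proposal
only; KERNEL is the proposed bit and all of the values here are
illustrative):

	typedef unsigned long pteval_t;
	#define PRESENT	((pteval_t)1 << 0)
	#define YOUNG	((pteval_t)1 << 1)
	#define USER	((pteval_t)1 << 8)	/* 2-level L_PTE_USER */
	#define KERNEL	((pteval_t)1 << 11)	/* proposed, does not exist */

	/* zero the hardware pte unless it is present, young and usable by
	 * either userspace or the kernel */
	static int pte_usable(pteval_t val)
	{
		const pteval_t u = PRESENT | YOUNG | USER;
		const pteval_t k = PRESENT | YOUNG | KERNEL;

		return (val & u) == u || (val & k) == k;
	}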

That'll make the set_pte assembly more horrid, but I guess that's the
price you pay for removing useful features which architecture folk don't
like from CPUs... and it gets more horrid because you can't encode some
of those bit patterns with the standard 8-bit and shift opcode
representation.

Really fun bug, but it needs more thought about how to solve it.

* [PATCH 3/4] ARM: mm: introduce L_PTE_VALID for page table entries
  2012-09-20 15:56 ` [PATCH 3/4] ARM: mm: introduce L_PTE_VALID for page table entries Will Deacon
@ 2012-09-20 22:21   ` Russell King - ARM Linux
  2012-09-21  9:22     ` Will Deacon
  0 siblings, 1 reply; 14+ messages in thread
From: Russell King - ARM Linux @ 2012-09-20 22:21 UTC
  To: linux-arm-kernel

On Thu, Sep 20, 2012 at 04:56:44PM +0100, Will Deacon wrote:
> diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
> index 2317a71..c44a1ec 100644
> --- a/arch/arm/include/asm/pgtable-2level.h
> +++ b/arch/arm/include/asm/pgtable-2level.h
> @@ -115,6 +115,7 @@
>   * The PTE table pointer refers to the hardware entries; the "Linux"
>   * entries are stored 1024 bytes below.
>   */
> +#define L_PTE_VALID		(_AT(pteval_t, 1) << 0)		/* Valid */
>  #define L_PTE_PRESENT		(_AT(pteval_t, 1) << 0)
>  #define L_PTE_YOUNG		(_AT(pteval_t, 1) << 1)
>  #define L_PTE_FILE		(_AT(pteval_t, 1) << 2)	/* only when !PRESENT */
> diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
> index b249035..e32311a 100644
> --- a/arch/arm/include/asm/pgtable-3level.h
> +++ b/arch/arm/include/asm/pgtable-3level.h
> @@ -67,7 +67,8 @@
>   * These bits overlap with the hardware bits but the naming is preserved for
>   * consistency with the classic page table format.
>   */
> -#define L_PTE_PRESENT		(_AT(pteval_t, 3) << 0)		/* Valid */
> +#define L_PTE_VALID		(_AT(pteval_t, 1) << 0)		/* Valid */
> +#define L_PTE_PRESENT		(_AT(pteval_t, 3) << 0)		/* Present */
>  #define L_PTE_FILE		(_AT(pteval_t, 1) << 2)		/* only when !PRESENT */
>  #define L_PTE_USER		(_AT(pteval_t, 1) << 6)		/* AP[1] */
>  #define L_PTE_RDONLY		(_AT(pteval_t, 1) << 7)		/* AP[2] */
> diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
> index 41dc31f..df206c1 100644
> --- a/arch/arm/include/asm/pgtable.h
> +++ b/arch/arm/include/asm/pgtable.h
> @@ -203,9 +203,7 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
>  #define pte_exec(pte)		(!(pte_val(pte) & L_PTE_XN))
>  #define pte_special(pte)	(0)
>  
> -#define pte_present_user(pte) \
> -	((pte_val(pte) & (L_PTE_PRESENT | L_PTE_USER)) == \
> -	 (L_PTE_PRESENT | L_PTE_USER))
> +#define pte_present_user(pte)  (pte_present(pte) && (pte_val(pte) & L_PTE_USER))

This causes the compiler to do the "present" test and then, only if that
evaluates true, to do the "user" test.  That's not what we want; we want
the compiler to combine the two tests into one (which it should be able
to do).

* [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS
  2012-09-20 22:12 ` [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS Russell King - ARM Linux
@ 2012-09-20 22:23   ` Benjamin Herrenschmidt
  2012-09-21  9:32   ` Will Deacon
  1 sibling, 0 replies; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2012-09-20 22:23 UTC
  To: linux-arm-kernel

On Thu, 2012-09-20 at 23:12 +0100, Russell King - ARM Linux wrote:

 .../...

> That'll make the set_pte assembly more horrid, but I guess that's the
> price you pay for removing useful features which architecture folk don't
> like from CPUs... and it gets more horrid because you can't encode some
> of those bit patterns with the standard 8-bit and shift opcode
> representation.
> 
> Really fun bug, but it needs more thought about how to solve it.

Yeah, fun, isn't it? :-)

The x86 way to solve it is simple, though ugly... they use a SW bit
called "PROTNONE". A PROT_NONE page gets that instead of the HW valid
bit, and pte_present() tests whether either bit is set.
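
From memory, their test is roughly the following (paraphrasing
arch/x86/include/asm/pgtable.h of this era, so treat it as
approximate):

	static inline int pte_present(pte_t a)
	{
		return pte_flags(a) & (_PAGE_PRESENT | _PAGE_PROTNONE);
	}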

Cheers,
Ben.

* [PATCH 2/4] ARM: mm: don't use the access flag permissions mechanism for classic MMU
  2012-09-20 15:56 ` [PATCH 2/4] ARM: mm: don't use the access flag permissions mechanism for classic MMU Will Deacon
@ 2012-09-21  8:40   ` Catalin Marinas
  0 siblings, 0 replies; 14+ messages in thread
From: Catalin Marinas @ 2012-09-21  8:40 UTC
  To: linux-arm-kernel

On Thu, Sep 20, 2012 at 04:56:43PM +0100, Will Deacon wrote:
> The simplified access permissions model is not used for the classic MMU
> translation regime, so ensure that it is turned off in the sctlr prior
> to turning on address translation for ARMv7.
> 
> Signed-off-by: Will Deacon <will.deacon@arm.com>

Acked-by: Catalin Marinas <catalin.marinas@arm.com>

* [PATCH 1/4] ARM: mm: use pteval_t to represent page protection values
  2012-09-20 15:56 ` [PATCH 1/4] ARM: mm: use pteval_t to represent page protection values Will Deacon
  2012-09-20 18:28   ` Nicolas Pitre
@ 2012-09-21  8:48   ` Catalin Marinas
  1 sibling, 0 replies; 14+ messages in thread
From: Catalin Marinas @ 2012-09-21  8:48 UTC
  To: linux-arm-kernel

On Thu, Sep 20, 2012 at 04:56:42PM +0100, Will Deacon wrote:
> When updating the page protection map after calculating the user_pgprot
> value, the base protection map is temporarily stored in an unsigned long
> type, causing truncation of the protection bits when LPAE is enabled.
> This effectively means that calls to mprotect() will corrupt the upper
> page attributes, clearing the XN bit unconditionally.
> 
> This patch uses pteval_t to store the intermediate protection values,
> preserving the upper bits for 64-bit descriptors.
> 
> Signed-off-by: Will Deacon <will.deacon@arm.com>

Acked-by: Catalin Marinas <catalin.marinas@arm.com>

* [PATCH 3/4] ARM: mm: introduce L_PTE_VALID for page table entries
  2012-09-20 22:21   ` Russell King - ARM Linux
@ 2012-09-21  9:22     ` Will Deacon
  0 siblings, 0 replies; 14+ messages in thread
From: Will Deacon @ 2012-09-21  9:22 UTC
  To: linux-arm-kernel

Hi Russell,

On Thu, Sep 20, 2012 at 11:21:05PM +0100, Russell King - ARM Linux wrote:
> On Thu, Sep 20, 2012 at 04:56:44PM +0100, Will Deacon wrote:
> > diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
> > index 41dc31f..df206c1 100644
> > --- a/arch/arm/include/asm/pgtable.h
> > +++ b/arch/arm/include/asm/pgtable.h
> > @@ -203,9 +203,7 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
> >  #define pte_exec(pte)		(!(pte_val(pte) & L_PTE_XN))
> >  #define pte_special(pte)	(0)
> >  
> > -#define pte_present_user(pte) \
> > -	((pte_val(pte) & (L_PTE_PRESENT | L_PTE_USER)) == \
> > -	 (L_PTE_PRESENT | L_PTE_USER))
> > +#define pte_present_user(pte)  (pte_present(pte) && (pte_val(pte) & L_PTE_USER))
> 
> This causes the compiler to do the "present" test and then, only if that
> evaluates true, to do the "user" test.  That's not what we want; we want
> the compiler to combine the two tests into one (which it should be able
> to do).

This is tricky for LPAE because PTEs with the following suffixes must all be
treated as present:

	...1xxxx01	/* L_PTE_USER | L_PTE_VALID -- will be used by hugetlb */
	...1xxxx10	/* L_PTE_USER | L_PTE_PRESENT[1] */
	...1xxxx11	/* L_PTE_USER | L_PTE_PRESENT */

so you need to check that (a) L_PTE_USER is set and (b) that at least one
of the bottom two bits is set. How about something like:

	((pte_val(pte) & (L_PTE_PRESENT | L_PTE_USER)) > L_PTE_USER)
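
A quick userspace check of the trick with the LPAE values (illustrative
only, not kernel code):

	#include <assert.h>

	typedef unsigned long long pteval_t;
	#define L_PTE_PRESENT	((pteval_t)3 << 0)
	#define L_PTE_USER	((pteval_t)1 << 6)

	#define PRESENT_USER(v) \
		(((v) & (L_PTE_PRESENT | L_PTE_USER)) > L_PTE_USER)

	int main(void)
	{
		assert(PRESENT_USER(L_PTE_USER | 0x1));	/* ...01 */
		assert(PRESENT_USER(L_PTE_USER | 0x2));	/* ...10 */
		assert(PRESENT_USER(L_PTE_USER | 0x3));	/* ...11 */
		assert(!PRESENT_USER(L_PTE_USER));	/* user but faulting */
		assert(!PRESENT_USER(0x3));		/* present, kernel-only */
		return 0;
	}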

Will

* [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS
  2012-09-20 22:12 ` [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS Russell King - ARM Linux
  2012-09-20 22:23   ` Benjamin Herrenschmidt
@ 2012-09-21  9:32   ` Will Deacon
  1 sibling, 0 replies; 14+ messages in thread
From: Will Deacon @ 2012-09-21  9:32 UTC
  To: linux-arm-kernel

On Thu, Sep 20, 2012 at 11:12:15PM +0100, Russell King - ARM Linux wrote:
> On Thu, Sep 20, 2012 at 04:56:41PM +0100, Will Deacon wrote:
> > Hello,
> > 
> > After laughing at Ben H during LPC when it emerged that the PPC PROT_NONE
> > protection bits don't prevent kernel access to the protected pages, I
> > looked at the ARM code and, to my dismay, found that we have the same
> > problem when not using domains.
> > 
> > This patch series addresses the issue with the following points worth
> > noting:
> > 
> > 	- We use the last available software bit (11) for 2-level PTEs.
> > 	  Whilst this is somewhat of a pity, I can't think of a better
> > 	  reason to allocate it than to fix an outstanding bug.
> 
> But you don't - you've defined it to be bit 0, which is the same as the
> present bit, and clearing the present bit is not a good idea as it turns
> the entry into something like a swap entry.

Sorry, my patches may have been in a slightly confusing order. L_PTE_VALID
is a red herring, required simply to allow present, faulting entries. The
new software bit is L_PTE_NONE (I can rename this to L_PTE_KERNEL if you
prefer) and is introduced in patch 4/4.

> So, let's go back and restart the analysis (and document it).  What
> happens is we call pte_modify() on each present page, setting the new
> protection - __PAGE_NONE.

Thanks for this in-depth analysis, you've hit the problem on the head and
it's good to have that archived.

> On ARMv6+ without domains, only one of these changes:
> 
> L_PTE_RDONLY	L_PTE_USER	AP
> 1		0		101	(user n/a, system r/o)
> 0		0		001	(user n/a, system r/w)
> 1		1		111	(user r/o, system r/o)
> 0		1		011	(user r/w, system r/w)
> 
> However, now we perform user accesses using kernel-mode loads/stores,
> so the access permissions which are checked are the 'system'
> permissions, and as we can see from the above, there is no state in
> which the kernel is denied access to the page.
> 
> Why not make system access permissions reflect user access permissions?
> Well, that's easy to answer - you want your kernel data and code to be
> protected against reading/writing by userspace... so we need a
> combination of permissions to permit those encodings - and not
> surprisingly, clearing the L_PTE_USER bit gives us that.
> 
> What wasn't thought about properly was the effect of removing domains
> and converting the kernel-initiated userspace accesses to use kernel-mode
> accesses.  Yes, the obvious was changed (so that user r/o also became
> kernel r/o) but that's not enough.

Agreed.

> The only solution I can see is to have L_PTE_KERNEL which indicates
> whether this is a kernel mapping.  If both L_PTE_KERNEL and L_PTE_USER
> are clear, then we zero out the hardware PTE entry - and that would be
> the case for a userspace PROT_NONE entry.  Otherwise, we fall back to
> our current behaviour.
> 
> So, essentially we change from ignoring the PTE value when
> 
> 	(val & (PRESENT | YOUNG)) != (PRESENT | YOUNG)
> 
> to:
> 
> 	((val & (PRESENT | YOUNG | USER)) != (PRESENT | YOUNG | USER)) &&
> 	((val & (PRESENT | YOUNG | KERNEL)) != (PRESENT | YOUNG | KERNEL))

This is basically what I've done in the final patch, just with a different
name for the bit. The reason I added the VALID stuff before is because
the software bits overlap the hardware bits for LPAE, so you can't just
write #0 to the pte if you want it to be treated as present. Instead, I make
use of L_PTE_PRESENT being 2 bits wide and just clear the bottom bit (which
causes the hardware to fault on access). For 2 levels, L_PTE_VALID ==
L_PTE_PRESENT so everything is as before.
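
i.e. roughly the following (values from patch 3; the helper name is
made up for illustration):

	typedef unsigned long long pteval_t;
	#define L_PTE_VALID	((pteval_t)1 << 0)

	static pteval_t make_faulting(pteval_t pte)
	{
		/* x0b is invalid to the MMU, but bit 1 survives, so the
		 * Linux present test (either of bits 1:0) still succeeds. */
		return pte & ~L_PTE_VALID;
	}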

You *could* do this slightly differently and check for L_PTE_NONE in
pte_present, but this doesn't help us for section mappings in userspace,
which I have working with hugetlb (I'll post this soon).

> That'll make the set_pte assembly more horrid, but I guess that's the
> price you pay for removing useful features which architecture folk don't
> like from CPUs... and it gets more horrid because you can't encode some
> of those bit patterns with the standard 8-bit and shift opcode
> representation.

It's not too bad. PROT_NONE pages are pretty rare, so you can just add the
check after the YOUNG and PRESENT checks with the appropriate condition
flags.

Cheers,

Will
