* [PATCH 1/4] ARM: mm: use pteval_t to represent page protection values
2012-09-20 15:56 [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS Will Deacon
@ 2012-09-20 15:56 ` Will Deacon
2012-09-20 18:28 ` Nicolas Pitre
2012-09-21 8:48 ` Catalin Marinas
2012-09-20 15:56 ` [PATCH 2/4] ARM: mm: don't use the access flag permissions mechanism for classic MMU Will Deacon
` (3 subsequent siblings)
4 siblings, 2 replies; 14+ messages in thread
From: Will Deacon @ 2012-09-20 15:56 UTC (permalink / raw)
To: linux-arm-kernel
When updating the page protection map after calculating the user_pgprot
value, the base protection map is temporarily stored in an unsigned long
type, causing truncation of the protection bits when LPAE is enabled.
This effectively means that calls to mprotect() will corrupt the upper
page attributes, clearing the XN bit unconditionally.
This patch uses pteval_t to store the intermediate protection values,
preserving the upper bits for 64-bit descriptors.
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
arch/arm/mm/mmu.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index c2fa21d..b68b531 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -498,7 +498,7 @@ static void __init build_mem_type_table(void)
#endif
for (i = 0; i < 16; i++) {
- unsigned long v = pgprot_val(protection_map[i]);
+ pteval_t v = pgprot_val(protection_map[i]);
protection_map[i] = __pgprot(v | user_pgprot);
}
--
1.7.4.1
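For reference, a minimal standalone sketch of the truncation (simplified types, illustrative only, not the kernel code itself): with LPAE, pteval_t is 64 bits and attributes such as XN live at bit 54, beyond the reach of a 32-bit unsigned long.

#include <stdint.h>
#include <stdio.h>

typedef uint64_t pteval_t;                  /* 64-bit LPAE descriptor */
#define L_PTE_XN        ((pteval_t)1 << 54) /* execute-never */

int main(void)
{
        pteval_t prot = L_PTE_XN | 0x3;     /* XN plus the low valid bits */
        uint32_t v32  = (uint32_t)prot;     /* "unsigned long" on 32-bit ARM */

        printf("pteval_t keeps XN:      %#llx\n", (unsigned long long)prot);
        printf("unsigned long drops it: %#llx\n", (unsigned long long)v32);
        return 0;
}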
* [PATCH 1/4] ARM: mm: use pteval_t to represent page protection values
2012-09-20 15:56 ` [PATCH 1/4] ARM: mm: use pteval_t to represent page protection values Will Deacon
@ 2012-09-20 18:28 ` Nicolas Pitre
2012-09-20 21:07 ` Will Deacon
2012-09-21 8:48 ` Catalin Marinas
1 sibling, 1 reply; 14+ messages in thread
From: Nicolas Pitre @ 2012-09-20 18:28 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, 20 Sep 2012, Will Deacon wrote:
> When updating the page protection map after calculating the user_pgprot
> value, the base protection map is temporarily stored in an unsigned long
> type, causing truncation of the protection bits when LPAE is enabled.
> This effectively means that calls to mprotect() will corrupt the upper
> page attributes, clearing the XN bit unconditionally.
>
> This patch uses pteval_t to store the intermediate protection values,
> preserving the upper bits for 64-bit descriptors.
>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Maybe CC to stable?
Are you sure there are no other occurrences of this elsewhere?
Unfortunately, STRICT_MM_TYPECHECKS can't help here.
> ---
> arch/arm/mm/mmu.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> index c2fa21d..b68b531 100644
> --- a/arch/arm/mm/mmu.c
> +++ b/arch/arm/mm/mmu.c
> @@ -498,7 +498,7 @@ static void __init build_mem_type_table(void)
> #endif
>
> for (i = 0; i < 16; i++) {
> - unsigned long v = pgprot_val(protection_map[i]);
> + pteval_t v = pgprot_val(protection_map[i]);
> protection_map[i] = __pgprot(v | user_pgprot);
> }
>
> --
> 1.7.4.1
>
* [PATCH 1/4] ARM: mm: use pteval_t to represent page protection values
2012-09-20 18:28 ` Nicolas Pitre
@ 2012-09-20 21:07 ` Will Deacon
0 siblings, 0 replies; 14+ messages in thread
From: Will Deacon @ 2012-09-20 21:07 UTC (permalink / raw)
To: linux-arm-kernel
Hi Nicolas,
On Thu, Sep 20, 2012 at 07:28:14PM +0100, Nicolas Pitre wrote:
> On Thu, 20 Sep 2012, Will Deacon wrote:
>
> > When updating the page protection map after calculating the user_pgprot
> > value, the base protection map is temporarily stored in an unsigned long
> > type, causing truncation of the protection bits when LPAE is enabled.
> > This effectively means that calls to mprotect() will corrupt the upper
> > page attributes, clearing the XN bit unconditionally.
> >
> > This patch uses pteval_t to store the intermediate protection values,
> > preserving the upper bits for 64-bit descriptors.
> >
> > Signed-off-by: Will Deacon <will.deacon@arm.com>
>
> Acked-by: Nicolas Pitre <nico@linaro.org>
Thanks.
> Maybe CC to stable?
Of course.
> Are you sure there are no other occurrences of this elsewhere?
> Unfortunately, STRICT_MM_TYPECHECKS can't help here.
I grepped for lines containing `pgprot' and `long' and didn't find any
other problematic code. Everything else seems to use the pgprot_t types
consistently.
Will
* [PATCH 1/4] ARM: mm: use pteval_t to represent page protection values
2012-09-20 15:56 ` [PATCH 1/4] ARM: mm: use pteval_t to represent page protection values Will Deacon
2012-09-20 18:28 ` Nicolas Pitre
@ 2012-09-21 8:48 ` Catalin Marinas
1 sibling, 0 replies; 14+ messages in thread
From: Catalin Marinas @ 2012-09-21 8:48 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Sep 20, 2012 at 04:56:42PM +0100, Will Deacon wrote:
> When updating the page protection map after calculating the user_pgprot
> value, the base protection map is temporarily stored in an unsigned long
> type, causing truncation of the protection bits when LPAE is enabled.
> This effectively means that calls to mprotect() will corrupt the upper
> page attributes, clearing the XN bit unconditionally.
>
> This patch uses pteval_t to store the intermediate protection values,
> preserving the upper bits for 64-bit descriptors.
>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
* [PATCH 2/4] ARM: mm: don't use the access flag permissions mechanism for classic MMU
2012-09-20 15:56 [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS Will Deacon
2012-09-20 15:56 ` [PATCH 1/4] ARM: mm: use pteval_t to represent page protection values Will Deacon
@ 2012-09-20 15:56 ` Will Deacon
2012-09-21 8:40 ` Catalin Marinas
2012-09-20 15:56 ` [PATCH 3/4] ARM: mm: introduce L_PTE_VALID for page table entries Will Deacon
` (2 subsequent siblings)
4 siblings, 1 reply; 14+ messages in thread
From: Will Deacon @ 2012-09-20 15:56 UTC (permalink / raw)
To: linux-arm-kernel
The simplified access permissions model is not used for the classic MMU
translation regime, so ensure that it is turned off in the sctlr prior
to turning on address translation for ARMv7.
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
arch/arm/mm/proc-v7-2level.S | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm/mm/proc-v7-2level.S b/arch/arm/mm/proc-v7-2level.S
index fd045e7..e37600b 100644
--- a/arch/arm/mm/proc-v7-2level.S
+++ b/arch/arm/mm/proc-v7-2level.S
@@ -161,11 +161,11 @@ ENDPROC(cpu_v7_set_pte_ext)
* TFR EV X F I D LR S
* .EEE ..EE PUI. .T.T 4RVI ZWRS BLDP WCAM
* rxxx rrxx xxx0 0101 xxxx xxxx x111 xxxx < forced
- * 1 0 110 0011 1100 .111 1101 < we want
+ * 01 0 110 0011 1100 .111 1101 < we want
*/
.align 2
.type v7_crval, #object
v7_crval:
- crval clear=0x0120c302, mmuset=0x10c03c7d, ucset=0x00c01c7c
+ crval clear=0x2120c302, mmuset=0x10c03c7d, ucset=0x00c01c7c
.previous
--
1.7.4.1
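For context, the bit added to the 'clear' mask (0x2120c302 versus 0x0120c302) is bit 29 of the SCTLR, the Access Flag Enable (AFE) bit that selects the simplified permissions model. A hedged sketch of checking it at runtime (illustrative only, not part of the patch):

#define SCTLR_AFE       (1u << 29)      /* Access Flag Enable (ARMv7) */

static inline unsigned int read_sctlr(void)
{
        unsigned int v;
        __asm__ volatile("mrc p15, 0, %0, c1, c0, 0" : "=r" (v));
        return v;
}

static inline int simplified_ap_enabled(void)
{
        return (read_sctlr() & SCTLR_AFE) != 0;
}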
* [PATCH 3/4] ARM: mm: introduce L_PTE_VALID for page table entries
2012-09-20 15:56 [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS Will Deacon
2012-09-20 15:56 ` [PATCH 1/4] ARM: mm: use pteval_t to represent page protection values Will Deacon
2012-09-20 15:56 ` [PATCH 2/4] ARM: mm: don't use the access flag permissions mechanism for classic MMU Will Deacon
@ 2012-09-20 15:56 ` Will Deacon
2012-09-20 22:21 ` Russell King - ARM Linux
2012-09-20 15:56 ` [PATCH 4/4] ARM: mm: introduce present, faulting entries for PAGE_NONE Will Deacon
2012-09-20 22:12 ` [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS Russell King - ARM Linux
4 siblings, 1 reply; 14+ messages in thread
From: Will Deacon @ 2012-09-20 15:56 UTC (permalink / raw)
To: linux-arm-kernel
For long-descriptor translation table formats, the ARMv7 architecture
defines the last two bits of the second- and third-level descriptors to
be:
x0b - Invalid
01b - Block (second-level), Reserved (third-level)
11b - Table (second-level), Page (third-level)
This allows us to define L_PTE_PRESENT as (3 << 0) and use this value to
create ptes directly. However, when determining whether a given pte
value is present in the low-level page table accessors, we only need to
check the least significant bit of the descriptor, allowing us to write
faulting, present entries which are required for PROT_NONE mappings.
This patch introduces L_PTE_VALID, which can be used to test whether a
pte should fault, and updates the low-level page table accessors
accordingly.
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
arch/arm/include/asm/pgtable-2level.h | 1 +
arch/arm/include/asm/pgtable-3level.h | 3 ++-
arch/arm/include/asm/pgtable.h | 4 +---
arch/arm/mm/proc-v7-2level.S | 2 +-
arch/arm/mm/proc-v7-3level.S | 2 +-
5 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
index 2317a71..c44a1ec 100644
--- a/arch/arm/include/asm/pgtable-2level.h
+++ b/arch/arm/include/asm/pgtable-2level.h
@@ -115,6 +115,7 @@
* The PTE table pointer refers to the hardware entries; the "Linux"
* entries are stored 1024 bytes below.
*/
+#define L_PTE_VALID (_AT(pteval_t, 1) << 0) /* Valid */
#define L_PTE_PRESENT (_AT(pteval_t, 1) << 0)
#define L_PTE_YOUNG (_AT(pteval_t, 1) << 1)
#define L_PTE_FILE (_AT(pteval_t, 1) << 2) /* only when !PRESENT */
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index b249035..e32311a 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -67,7 +67,8 @@
* These bits overlap with the hardware bits but the naming is preserved for
* consistency with the classic page table format.
*/
-#define L_PTE_PRESENT (_AT(pteval_t, 3) << 0) /* Valid */
+#define L_PTE_VALID (_AT(pteval_t, 1) << 0) /* Valid */
+#define L_PTE_PRESENT (_AT(pteval_t, 3) << 0) /* Present */
#define L_PTE_FILE (_AT(pteval_t, 1) << 2) /* only when !PRESENT */
#define L_PTE_USER (_AT(pteval_t, 1) << 6) /* AP[1] */
#define L_PTE_RDONLY (_AT(pteval_t, 1) << 7) /* AP[2] */
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index 41dc31f..df206c1 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -203,9 +203,7 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
#define pte_exec(pte) (!(pte_val(pte) & L_PTE_XN))
#define pte_special(pte) (0)
-#define pte_present_user(pte) \
- ((pte_val(pte) & (L_PTE_PRESENT | L_PTE_USER)) == \
- (L_PTE_PRESENT | L_PTE_USER))
+#define pte_present_user(pte) (pte_present(pte) && (pte_val(pte) & L_PTE_USER))
#if __LINUX_ARM_ARCH__ < 6
static inline void __sync_icache_dcache(pte_t pteval)
diff --git a/arch/arm/mm/proc-v7-2level.S b/arch/arm/mm/proc-v7-2level.S
index e37600b..e755e9f 100644
--- a/arch/arm/mm/proc-v7-2level.S
+++ b/arch/arm/mm/proc-v7-2level.S
@@ -100,7 +100,7 @@ ENTRY(cpu_v7_set_pte_ext)
orrne r3, r3, #PTE_EXT_XN
tst r1, #L_PTE_YOUNG
- tstne r1, #L_PTE_PRESENT
+ tstne r1, #L_PTE_VALID
moveq r3, #0
ARM( str r3, [r0, #2048]! )
diff --git a/arch/arm/mm/proc-v7-3level.S b/arch/arm/mm/proc-v7-3level.S
index 8de0f1d..d23d067 100644
--- a/arch/arm/mm/proc-v7-3level.S
+++ b/arch/arm/mm/proc-v7-3level.S
@@ -65,7 +65,7 @@ ENDPROC(cpu_v7_switch_mm)
*/
ENTRY(cpu_v7_set_pte_ext)
#ifdef CONFIG_MMU
- tst r2, #L_PTE_PRESENT
+ tst r2, #L_PTE_VALID
beq 1f
tst r3, #1 << (55 - 32) @ L_PTE_DIRTY
orreq r2, #L_PTE_RDONLY
--
1.7.4.1
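Restating the descriptor encoding from the commit message as an illustrative decode of the bottom two bits at the third level (sketch only):

#include <stdint.h>

enum desc_type { DESC_FAULT, DESC_RESERVED, DESC_PAGE };

static enum desc_type l3_desc_type(uint64_t desc)
{
        switch (desc & 3) {
        case 0:
        case 2:
                return DESC_FAULT;      /* x0b: invalid, the MMU faults */
        case 1:
                return DESC_RESERVED;   /* 01b: reserved at the third level */
        default:
                return DESC_PAGE;       /* 11b: page descriptor */
        }
}

/* Clearing only bit 0 of a page descriptor (11b -> 10b) therefore
 * yields an entry that faults in hardware while still carrying part
 * of L_PTE_PRESENT - the property this series relies on. */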
* [PATCH 3/4] ARM: mm: introduce L_PTE_VALID for page table entries
2012-09-20 15:56 ` [PATCH 3/4] ARM: mm: introduce L_PTE_VALID for page table entries Will Deacon
@ 2012-09-20 22:21 ` Russell King - ARM Linux
2012-09-21 9:22 ` Will Deacon
0 siblings, 1 reply; 14+ messages in thread
From: Russell King - ARM Linux @ 2012-09-20 22:21 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Sep 20, 2012 at 04:56:44PM +0100, Will Deacon wrote:
> diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
> index 2317a71..c44a1ec 100644
> --- a/arch/arm/include/asm/pgtable-2level.h
> +++ b/arch/arm/include/asm/pgtable-2level.h
> @@ -115,6 +115,7 @@
> * The PTE table pointer refers to the hardware entries; the "Linux"
> * entries are stored 1024 bytes below.
> */
> +#define L_PTE_VALID (_AT(pteval_t, 1) << 0) /* Valid */
> #define L_PTE_PRESENT (_AT(pteval_t, 1) << 0)
> #define L_PTE_YOUNG (_AT(pteval_t, 1) << 1)
> #define L_PTE_FILE (_AT(pteval_t, 1) << 2) /* only when !PRESENT */
> diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
> index b249035..e32311a 100644
> --- a/arch/arm/include/asm/pgtable-3level.h
> +++ b/arch/arm/include/asm/pgtable-3level.h
> @@ -67,7 +67,8 @@
> * These bits overlap with the hardware bits but the naming is preserved for
> * consistency with the classic page table format.
> */
> -#define L_PTE_PRESENT (_AT(pteval_t, 3) << 0) /* Valid */
> +#define L_PTE_VALID (_AT(pteval_t, 1) << 0) /* Valid */
> +#define L_PTE_PRESENT (_AT(pteval_t, 3) << 0) /* Present */
> #define L_PTE_FILE (_AT(pteval_t, 1) << 2) /* only when !PRESENT */
> #define L_PTE_USER (_AT(pteval_t, 1) << 6) /* AP[1] */
> #define L_PTE_RDONLY (_AT(pteval_t, 1) << 7) /* AP[2] */
> diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
> index 41dc31f..df206c1 100644
> --- a/arch/arm/include/asm/pgtable.h
> +++ b/arch/arm/include/asm/pgtable.h
> @@ -203,9 +203,7 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
> #define pte_exec(pte) (!(pte_val(pte) & L_PTE_XN))
> #define pte_special(pte) (0)
>
> -#define pte_present_user(pte) \
> - ((pte_val(pte) & (L_PTE_PRESENT | L_PTE_USER)) == \
> - (L_PTE_PRESENT | L_PTE_USER))
> +#define pte_present_user(pte) (pte_present(pte) && (pte_val(pte) & L_PTE_USER))
This causes the compiler to do the "present" test and then, only if that
evaluates true, to do the "user" test. That's not what we want; we want
the compiler to combine the two tests into one (which it should be able
to do).
* [PATCH 3/4] ARM: mm: introduce L_PTE_VALID for page table entries
2012-09-20 22:21 ` Russell King - ARM Linux
@ 2012-09-21 9:22 ` Will Deacon
0 siblings, 0 replies; 14+ messages in thread
From: Will Deacon @ 2012-09-21 9:22 UTC (permalink / raw)
To: linux-arm-kernel
Hi Russell,
On Thu, Sep 20, 2012 at 11:21:05PM +0100, Russell King - ARM Linux wrote:
> On Thu, Sep 20, 2012 at 04:56:44PM +0100, Will Deacon wrote:
> > diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
> > index 41dc31f..df206c1 100644
> > --- a/arch/arm/include/asm/pgtable.h
> > +++ b/arch/arm/include/asm/pgtable.h
> > @@ -203,9 +203,7 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
> > #define pte_exec(pte) (!(pte_val(pte) & L_PTE_XN))
> > #define pte_special(pte) (0)
> >
> > -#define pte_present_user(pte) \
> > - ((pte_val(pte) & (L_PTE_PRESENT | L_PTE_USER)) == \
> > - (L_PTE_PRESENT | L_PTE_USER))
> > +#define pte_present_user(pte) (pte_present(pte) && (pte_val(pte) & L_PTE_USER))
>
> This causes the compiler to do the "present" test and then, only if that
> evaluates true, to do the "user" test. That's not what we want; we want
> the compiler to combine the two tests into one (which it should be able
> to do).
This is tricky for LPAE because PTEs with the following suffixes must all be
treated as present:
...1xxxx01 /* L_PTE_USER | L_PTE_VALID -- will be used by hugetlb */
...1xxxx10 /* L_PTE_USER | L_PTE_PRESENT[1] */
...1xxxx11 /* L_PTE_USER | L_PTE_PRESENT */
so you need to check that (a) L_PTE_USER is set and (b) that at least one
of the bottom two bits is set. How about something like:
((pte_val(pte) & (L_PTE_PRESENT | L_PTE_USER)) > L_PTE_USER)
Will
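As a sanity check, a small sketch (flag values per the LPAE definitions in this series) that exhaustively compares the suggested single-compare form against the two separate tests:

#include <assert.h>
#include <stdint.h>

#define L_PTE_PRESENT   ((uint64_t)3 << 0)
#define L_PTE_USER      ((uint64_t)1 << 6)

int main(void)
{
        for (uint64_t pte = 0; pte < 0x100; pte++) {
                int two_tests = (pte & L_PTE_USER) && (pte & L_PTE_PRESENT);
                int one_cmp = (pte & (L_PTE_PRESENT | L_PTE_USER)) > L_PTE_USER;
                assert(two_tests == one_cmp);   /* agree for every pattern */
        }
        return 0;
}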
* [PATCH 4/4] ARM: mm: introduce present, faulting entries for PAGE_NONE
2012-09-20 15:56 [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS Will Deacon
` (2 preceding siblings ...)
2012-09-20 15:56 ` [PATCH 3/4] ARM: mm: introduce L_PTE_VALID for page table entries Will Deacon
@ 2012-09-20 15:56 ` Will Deacon
2012-09-20 22:12 ` [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS Russell King - ARM Linux
4 siblings, 0 replies; 14+ messages in thread
From: Will Deacon @ 2012-09-20 15:56 UTC (permalink / raw)
To: linux-arm-kernel
PROT_NONE mappings apply the page protection attributes defined by _P000
which translate to PAGE_NONE for ARM. These attributes specify an XN,
RDONLY pte that is inaccessible to userspace. However, on kernels
configured without support for domains, such a pte *is* accessible to
the kernel and can be read via get_user, allowing tasks to read
PROT_NONE pages via syscalls such as read/write over a pipe.
This patch introduces a new software pte flag, L_PTE_NONE, that is set
to identify faulting, present entries.
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
arch/arm/include/asm/pgtable-2level.h | 1 +
arch/arm/include/asm/pgtable-3level.h | 1 +
arch/arm/include/asm/pgtable.h | 6 +++---
arch/arm/mm/proc-macros.S | 4 ++++
arch/arm/mm/proc-v7-2level.S | 4 ++++
arch/arm/mm/proc-v7-3level.S | 3 +++
6 files changed, 16 insertions(+), 3 deletions(-)
diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
index c44a1ec..f97ee02 100644
--- a/arch/arm/include/asm/pgtable-2level.h
+++ b/arch/arm/include/asm/pgtable-2level.h
@@ -124,6 +124,7 @@
#define L_PTE_USER (_AT(pteval_t, 1) << 8)
#define L_PTE_XN (_AT(pteval_t, 1) << 9)
#define L_PTE_SHARED (_AT(pteval_t, 1) << 10) /* shared(v6), coherent(xsc3) */
+#define L_PTE_NONE (_AT(pteval_t, 1) << 11)
/*
* These are the memory types, defined to be compatible with
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index e32311a..a3f3792 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -77,6 +77,7 @@
#define L_PTE_XN (_AT(pteval_t, 1) << 54) /* XN */
#define L_PTE_DIRTY (_AT(pteval_t, 1) << 55) /* unused */
#define L_PTE_SPECIAL (_AT(pteval_t, 1) << 56) /* unused */
+#define L_PTE_NONE (_AT(pteval_t, 1) << 57) /* PROT_NONE */
/*
* To be used in assembly code with the upper page attributes.
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index df206c1..7e21903 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -73,7 +73,7 @@ extern pgprot_t pgprot_kernel;
#define _MOD_PROT(p, b) __pgprot(pgprot_val(p) | (b))
-#define PAGE_NONE _MOD_PROT(pgprot_user, L_PTE_XN | L_PTE_RDONLY)
+#define PAGE_NONE _MOD_PROT(pgprot_user, L_PTE_XN | L_PTE_RDONLY | L_PTE_NONE)
#define PAGE_SHARED _MOD_PROT(pgprot_user, L_PTE_USER | L_PTE_XN)
#define PAGE_SHARED_EXEC _MOD_PROT(pgprot_user, L_PTE_USER)
#define PAGE_COPY _MOD_PROT(pgprot_user, L_PTE_USER | L_PTE_RDONLY | L_PTE_XN)
@@ -83,7 +83,7 @@ extern pgprot_t pgprot_kernel;
#define PAGE_KERNEL _MOD_PROT(pgprot_kernel, L_PTE_XN)
#define PAGE_KERNEL_EXEC pgprot_kernel
-#define __PAGE_NONE __pgprot(_L_PTE_DEFAULT | L_PTE_RDONLY | L_PTE_XN)
+#define __PAGE_NONE __pgprot(_L_PTE_DEFAULT | L_PTE_RDONLY | L_PTE_XN | L_PTE_NONE)
#define __PAGE_SHARED __pgprot(_L_PTE_DEFAULT | L_PTE_USER | L_PTE_XN)
#define __PAGE_SHARED_EXEC __pgprot(_L_PTE_DEFAULT | L_PTE_USER)
#define __PAGE_COPY __pgprot(_L_PTE_DEFAULT | L_PTE_USER | L_PTE_RDONLY | L_PTE_XN)
@@ -240,7 +240,7 @@ static inline pte_t pte_mkspecial(pte_t pte) { return pte; }
static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
{
- const pteval_t mask = L_PTE_XN | L_PTE_RDONLY | L_PTE_USER;
+ const pteval_t mask = L_PTE_XN | L_PTE_RDONLY | L_PTE_USER | L_PTE_NONE;
pte_val(pte) = (pte_val(pte) & ~mask) | (pgprot_val(newprot) & mask);
return pte;
}
diff --git a/arch/arm/mm/proc-macros.S b/arch/arm/mm/proc-macros.S
index 2d8ff3a..58b201c 100644
--- a/arch/arm/mm/proc-macros.S
+++ b/arch/arm/mm/proc-macros.S
@@ -167,6 +167,10 @@
tst r1, #L_PTE_YOUNG
tstne r1, #L_PTE_PRESENT
moveq r3, #0
+#ifndef CONFIG_CPU_USE_DOMAINS
+ tstne r1, #L_PTE_NONE
+ movne r3, #0
+#endif
str r3, [r0]
mcr p15, 0, r0, c7, c10, 1 @ flush_pte
diff --git a/arch/arm/mm/proc-v7-2level.S b/arch/arm/mm/proc-v7-2level.S
index e755e9f..6d98c13 100644
--- a/arch/arm/mm/proc-v7-2level.S
+++ b/arch/arm/mm/proc-v7-2level.S
@@ -101,6 +101,10 @@ ENTRY(cpu_v7_set_pte_ext)
tst r1, #L_PTE_YOUNG
tstne r1, #L_PTE_VALID
+#ifndef CONFIG_CPU_USE_DOMAINS
+ eorne r1, r1, #L_PTE_NONE
+ tstne r1, #L_PTE_NONE
+#endif
moveq r3, #0
ARM( str r3, [r0, #2048]! )
diff --git a/arch/arm/mm/proc-v7-3level.S b/arch/arm/mm/proc-v7-3level.S
index d23d067..7b56386 100644
--- a/arch/arm/mm/proc-v7-3level.S
+++ b/arch/arm/mm/proc-v7-3level.S
@@ -67,6 +67,9 @@ ENTRY(cpu_v7_set_pte_ext)
#ifdef CONFIG_MMU
tst r2, #L_PTE_VALID
beq 1f
+ tst r3, #1 << (57 - 32) @ L_PTE_NONE
+ bicne r2, #L_PTE_VALID
+ bne 1f
tst r3, #1 << (55 - 32) @ L_PTE_DIRTY
orreq r2, #L_PTE_RDONLY
1: strd r2, r3, [r0]
--
1.7.4.1
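Putting the pieces together, a hedged sketch of the PROT_NONE flow this patch creates for LPAE (standalone model; hw_pte() is an invented name for the job cpu_v7_set_pte_ext does):

#include <stdint.h>

typedef uint64_t pteval_t;

#define L_PTE_VALID     ((pteval_t)1 << 0)
#define L_PTE_USER      ((pteval_t)1 << 6)
#define L_PTE_RDONLY    ((pteval_t)1 << 7)
#define L_PTE_XN        ((pteval_t)1 << 54)
#define L_PTE_NONE      ((pteval_t)1 << 57)

/* pte_modify() swaps only the protection bits, which is why the
 * hunk above must add L_PTE_NONE to the modifiable mask: */
static pteval_t pte_modify(pteval_t pte, pteval_t newprot)
{
        const pteval_t mask = L_PTE_XN | L_PTE_RDONLY | L_PTE_USER | L_PTE_NONE;
        return (pte & ~mask) | (newprot & mask);
}

/* The low-level setter then refuses to produce a hardware-valid
 * entry for a PROT_NONE mapping: */
static pteval_t hw_pte(pteval_t linux_pte)
{
        if (linux_pte & L_PTE_NONE)
                return linux_pte & ~L_PTE_VALID;  /* MMU faults on access */
        return linux_pte;
}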
* [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS
2012-09-20 15:56 [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS Will Deacon
` (3 preceding siblings ...)
2012-09-20 15:56 ` [PATCH 4/4] ARM: mm: introduce present, faulting entries for PAGE_NONE Will Deacon
@ 2012-09-20 22:12 ` Russell King - ARM Linux
2012-09-20 22:23 ` Benjamin Herrenschmidt
2012-09-21 9:32 ` Will Deacon
4 siblings, 2 replies; 14+ messages in thread
From: Russell King - ARM Linux @ 2012-09-20 22:12 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Sep 20, 2012 at 04:56:41PM +0100, Will Deacon wrote:
> Hello,
>
> After laughing at Ben H during LPC when it emerged that the PPC PROT_NONE
> protection bits don't prevent kernel access to the protected pages, I
> looked at the ARM code and, to my dismay, found that we have the same
> problem when not using domains.
>
> This patch series addresses the issue with the following points worth
> noting:
>
> - We use the last available software bit (11) for 2-level PTEs.
> Whilst this is somewhat of a pity, I can't think of a better
> reason to allocate it than to fix an outstanding bug.
But you don't - you've defined it to be bit 0, which is the same as the
present bit, and clearing the present bit is not a good idea as it turns
the entry into something like a swap entry.
So, let's go back and restart the analysis (and document it). What
happens is we call pte_modify() on each present page, setting the new
protection - __PAGE_NONE.
__PAGE_NONE is defined as: _L_PTE_DEFAULT | L_PTE_RDONLY | L_PTE_XN
which translates to: present | young | read only | xn - with no user
bit set. No user bit set means that userspace accesses, including
get_user() and put_user(), must fail when the page is accessed.
Now, this results in the pages being present (and they really are present
in userspace) but with no accessibility allowed for user-mode accesses,
and read-only for kernel-mode accesses. (The permission values
for ARMv7 are: APX=1 AP1=0 AP0=1).
Now, this works fine when DOMAINS are enabled, because we use the
T-bit operations to ensure that the accesses are checked against
userspace permissions.
However, when DOMAINS are disabled, we use standard loads/stores, which
check against kernel permissions, and so _any_ PROT_NONE mmap() will be
readable by any of the kernel user access functions.
So, on ARMv5 and below, these bits get translated as...
L_PTE_RDONLY L_PTE_USER AP
1 0 00 (user n/a, system r/o)
0 0 01 (user n/a, system r/w)
1 1 10 (user r/o, system r/w)
0 1 11 (user r/w, system r/w)
and this is fine because we always use domains (we must _never_ not use
domains here, because the available permissions that the architecture
gives us do not allow it.)
On ARMv6+ with domains:
L_PTE_RDONLY L_PTE_USER AP
1 0 101 (user n/a, system r/o)
0 0 001 (user n/a, system r/w)
1 1 010 (user r/o, system r/w)
0 1 011 (user r/w, system r/w)
which is also fine, as user accesses are performed using T-bit loads and
stores.
On ARMv6+ without domains, only one of these changes:
L_PTE_RDONLY L_PTE_USER AP
1 0 101 (user n/a, system r/o)
0 0 001 (user n/a, system r/w)
1 1 111 (user r/o, system r/o)
0 1 011 (user r/w, system r/w)
However, now we perform user accesses using kernel-mode loads/stores,
so the access permissions which are checked are the 'system'
permissions, and as we can see from the above, there is no state
in which the kernel is denied access to the page.
Why not make system access permissions reflect user access permissions?
Well, that's easy to answer - you want your kernel data and code to be
protected against reading/writing by userspace... so we need a
combination of permissions to permit those encodings - and not
surprisingly, clearing the L_PTE_USER bit gives us that.
What wasn't thought about properly was the effect of removing domains
and converting the kernel-initiated userspace accesses to use kernel-mode
accesses. Yes, the obvious was changed (so that user r/o also became
kernel r/o) but that's not enough.
The only solution I can see is to have L_PTE_KERNEL which indicates
whether this is a kernel mapping. If both L_PTE_KERNEL and L_PTE_USER
are clear, then we zero out the hardware PTE entry - and that would be
the case for a userspace PROT_NONE entry. Otherwise, we fall back to
our current behaviour.
So, essentially we change from ignoring the PTE value when
(val & (PRESENT | YOUNG)) != (PRESENT | YOUNG)
to:
(val & (PRESENT | YOUNG | USER)) != (PRESENT | YOUNG | USER) &&
(val & (PRESENT | YOUNG | KERNEL)) != (PRESENT | YOUNG | KERNEL)
That'll make the set_pte assembly more horrid, but I guess that's the
price you pay for removing useful features which architecture folk don't
like from CPUs... and it gets more horrid because you can't encode some
of those bit patterns with the standard 8-bit and shift opcode
representation.
Really fun bug, but it needs more thought about how to solve it.
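A C rendering of the proposed test (sketch; flag positions taken from the 2-level definitions, with KERNEL standing in for the hypothetical L_PTE_KERNEL at software bit 11):

typedef unsigned long pteval_t;

#define PRESENT ((pteval_t)1 << 0)
#define YOUNG   ((pteval_t)1 << 1)
#define USER    ((pteval_t)1 << 8)
#define KERNEL  ((pteval_t)1 << 11)     /* hypothetical L_PTE_KERNEL */

/* Write the entry to the hardware table only if it is present and
 * young and owned by at least one of user/kernel; a userspace
 * PROT_NONE pte (neither USER nor KERNEL) gets zeroed instead. */
static int pte_hw_live(pteval_t val)
{
        return (val & (PRESENT | YOUNG | USER)) == (PRESENT | YOUNG | USER) ||
               (val & (PRESENT | YOUNG | KERNEL)) == (PRESENT | YOUNG | KERNEL);
}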
* [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS
2012-09-20 22:12 ` [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS Russell King - ARM Linux
@ 2012-09-20 22:23 ` Benjamin Herrenschmidt
2012-09-21 9:32 ` Will Deacon
1 sibling, 0 replies; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2012-09-20 22:23 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, 2012-09-20 at 23:12 +0100, Russell King - ARM Linux wrote:
.../...
> That'll make the set_pte assembly more horrid, but I guess that's the
> price you pay for removing useful features which architecture folk don't
> like from CPUs... and it gets more horrid because you can't encode some
> of those bit patterns with the standard 8-bit and shift opcode
> representation.
>
> Really fun bug, but it needs more thought about how to solve it.
Yeah, fun isn't it? :-)
The x86 way to solve it is simple, though ugly... They use a SW bit
called "PROTNONE". A PROT_NONE page gets that instead of the HW valid
bit, and pte_present() tests whether either of them is set.
Cheers,
Ben.
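A sketch of the scheme Ben describes (bit positions modelled on the x86 headers of the time, where PROTNONE reuses the global bit; illustrative only):

#include <stdint.h>

typedef uint64_t pteval_t;

#define _PAGE_PRESENT   ((pteval_t)1 << 0)      /* hardware valid bit */
#define _PAGE_PROTNONE  ((pteval_t)1 << 8)      /* software bit */

/* A PROT_NONE pte drops the hardware valid bit but keeps PROTNONE, so
 * the MMU faults while the kernel still treats the page as present. */
static inline int pte_present(pteval_t pte)
{
        return (pte & (_PAGE_PRESENT | _PAGE_PROTNONE)) != 0;
}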
* [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS
2012-09-20 22:12 ` [PATCH 0/4] Fix PROT_NONE page permissions when !CPU_USE_DOMAINS Russell King - ARM Linux
2012-09-20 22:23 ` Benjamin Herrenschmidt
@ 2012-09-21 9:32 ` Will Deacon
1 sibling, 0 replies; 14+ messages in thread
From: Will Deacon @ 2012-09-21 9:32 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Sep 20, 2012 at 11:12:15PM +0100, Russell King - ARM Linux wrote:
> On Thu, Sep 20, 2012 at 04:56:41PM +0100, Will Deacon wrote:
> > Hello,
> >
> > After laughing at Ben H during LPC when it emerged that the PPC PROT_NONE
> > protection bits don't prevent kernel access to the protected pages, I
> > looked at the ARM code and, to my dismay, found that we have the same
> > problem when not using domains.
> >
> > This patch series addresses the issue with the following points worth
> > noting:
> >
> > - We use the last available software bit (11) for 2-level PTEs.
> > Whilst this is somewhat of a pity, I can't think of a better
> > reason to allocate it than to fix an outstanding bug.
>
> But you don't - you've defined it to be bit 0, which is the same as the
> present bit, and clearing the present bit is not a good idea as it turns
> the entry into something like a swap entry.
Sorry, my patches may have been in a slightly confusing order. L_PTE_VALID
is a red herring, required simply to allow present, faulting entries. The
new software bit is L_PTE_NONE (I can rename this to L_PTE_KERNEL if you
prefer) and is introduced in patch 4/4.
> So, let's go back and restart the analysis (and document it). What
> happens is we call pte_modify() on each present page, setting the new
> protection - __PAGE_NONE.
Thanks for this in-depth analysis, you've hit the problem on the head and
it's good to have that archived.
> On ARMv6+ without domains, only one of these changes:
>
> L_PTE_RDONLY L_PTE_USER AP
> 1 0 101 (user n/a, system r/o)
> 0 0 001 (user n/a, system r/w)
> 1 1 111 (user r/o, system r/o)
> 0 1 011 (user r/w, system r/w)
>
> However, now we perform user accesses using kernel-mode loads/stores,
> so the access permissions which are checked are the 'system'
> permissions, and as we can see from the above, there is no state
> in which the kernel is denied access to the page.
>
> Why not make system access permissions reflect user access permissions?
> Well, that's easy to answer - you want your kernel data and code to be
> protected against reading/writing by userspace... so we need a
> combination of permissions to permit those encodings - and not
> surprisingly, clearing the L_PTE_USER bit gives us that.
>
> What wasn't thought about properly was the effect of removing domains
> and converting the kernel-initiated userspace accesses to use kernel-mode
> accesses. Yes, the obvious was changed (so that user r/o also became
> kernel r/o) but that's not enough.
Agreed.
> The only solution I can see is to have L_PTE_KERNEL which indicates
> whether this is a kernel mapping. If both L_PTE_KERNEL and L_PTE_USER
> are clear, then we zero out the hardware PTE entry - and that would be
> the case for a userspace PROT_NONE entry. Otherwise, we fall back to
> our current behaviour.
>
> So, essentially we change from ignoring the PTE value when
>
> (val & (PRESENT | YOUNG)) != (PRESENT | YOUNG)
>
> to:
>
> (val & (PRESENT | YOUNG | USER)) != (PRESENT | YOUNG | USER) &&
> (val & (PRESENT | YOUNG | KERNEL)) != (PRESENT | YOUNG | KERNEL)
This is basically what I've done in the final patch, just with a different
name for the bit. The reason I added the VALID stuff before is because
the software bits overlap the hardware bits for LPAE, so you can't just
write #0 to the pte if you want it be treated as present. Instead, I make
use of L_PTE_PRESENT being 2 bits wide and just clear the bottom bit (which
causes the hardware to fault on access). For 2 levels, L_PTE_VALID ==
L_PTE_PRESENT so everything is as before.
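In code form (sketch; helper name invented):

typedef unsigned long long pteval_t;

#define L_PTE_VALID     ((pteval_t)1 << 0)
#define L_PTE_PRESENT   ((pteval_t)3 << 0)

/* Clearing only the bottom bit leaves an x0b descriptor, which the
 * MMU treats as invalid, while pte_present() - a bitwise AND against
 * L_PTE_PRESENT - still sees bit 1 and reports the page as present. */
static pteval_t pte_make_faulting(pteval_t pte)
{
        return pte & ~L_PTE_VALID;
}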
You *could* do this slightly differently and check for L_PTE_NONE in
pte_present, but this doesn't help us for section mappings in userspace,
which I have working with hugetlb (I'll post this soon).
> That'll make the set_pte assembly more horrid, but I guess that's the
> price you pay for removing useful features which architecture folk don't
> like from CPUs... and it gets more horrid because you can't encode some
> of those bit patterns with the standard 8-bit and shift opcode
> representation.
It's not too bad. PROT_NONE pages are pretty rare, so you can just add the
check after the YOUNG and PRESENT checks with the appropriate condition
flags.
Cheers,
Will