* [PATCH v1 RFC Zisslpcfi 05/20] mmap : Introducing new protection "PROT_SHADOWSTACK" for mmap
[not found] <20230213045351.3945824-1-debug@rivosinc.com>
@ 2023-02-13 4:53 ` Deepak Gupta
2023-02-13 4:53 ` [PATCH v1 RFC Zisslpcfi 07/20] elf: ELF header parsing in GNU property for cfi state Deepak Gupta
` (2 subsequent siblings)
3 siblings, 0 replies; 10+ messages in thread
From: Deepak Gupta @ 2023-02-13 4:53 UTC (permalink / raw)
To: linux-kernel, linux-riscv, Arnd Bergmann, Andrew Morton,
Paul Walmsley, Palmer Dabbelt, Albert Ou
Cc: Deepak Gupta, linux-arch, linux-mm
Major architectures (x86, arm, riscv) have introduced shadow stack
support in their architectures for return control-flow integrity.
These ISA extensions have special encodings to give the shadow stack
page a special property in the page table, i.e. a readonly page that is
still writeable under special scenarios. As an example, x86 has `call`
(and new shadow stack instructions) which can perform stores on the
shadow stack while regular stores are disallowed. Similarly, riscv has
the sspush & ssamoswap instructions which can perform stores while
regular stores are not allowed. Evidently, a page which can only be
written by certain special instructions but otherwise appears readonly
to regular stores needs a new protection flag.
This patch introduces a new mmap protection flag to indicate such
protection in a generic manner. Architectures can implement such
protection using arch-specific encodings in page tables.
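For illustration, the flag value proposed below (0x40) must stay disjoint from every generic protection bit already defined in asm-generic/mman-common.h. A minimal sketch of that invariant, with constants copied from the hunk below (the mmap usage in the comment is how userspace would be expected to request such a mapping, assuming a kernel built with this series):

```c
#include <assert.h>

/* Existing generic protection bits from asm-generic/mman-common.h */
#define PROT_READ  0x1
#define PROT_WRITE 0x2
#define PROT_EXEC  0x4
#define PROT_SEM   0x8
/* Proposed by this patch; 0x10 and 0x20 stay reserved for arch-specific use */
#define PROT_SHADOWSTACK 0x40

/* Returns 1 if the new flag is disjoint from every bit already claimed.
 * Userspace would then call e.g.
 *   mmap(NULL, size, PROT_SHADOWSTACK, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
 * and, per the do_mmap() hunk below, get -EINVAL on a kernel whose
 * architecture does not support shadow stacks. */
int prot_shadowstack_is_disjoint(void)
{
    int claimed = PROT_READ | PROT_WRITE | PROT_EXEC | PROT_SEM | 0x10 | 0x20;

    return (PROT_SHADOWSTACK & claimed) == 0;
}
```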
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
include/uapi/asm-generic/mman-common.h | 6 ++++++
mm/mmap.c | 4 ++++
2 files changed, 10 insertions(+)
diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
index 6ce1f1ceb432..c8e549b29a24 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -11,6 +11,12 @@
#define PROT_WRITE 0x2 /* page can be written */
#define PROT_EXEC 0x4 /* page can be executed */
#define PROT_SEM 0x8 /* page may be used for atomic ops */
+/*
+ * Major architectures (x86, aarch64, riscv) have shadow stack support now. Each
+ * architecture can choose different PTE encodings. x86 encodings are PTE.R=0,
+ * PTE.W=1, PTE.D=1; riscv encodings are PTE.R=0, PTE.W=1. aarch64 encodings
+ * are not published yet.
+ */
+#define PROT_SHADOWSTACK 0x40
/* 0x10 reserved for arch-specific use */
/* 0x20 reserved for arch-specific use */
#define PROT_NONE 0x0 /* page can not be accessed */
diff --git a/mm/mmap.c b/mm/mmap.c
index 425a9349e610..7e877c93d711 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -46,6 +46,7 @@
#include <linux/pkeys.h>
#include <linux/oom.h>
#include <linux/sched/mm.h>
+#include <linux/processor.h>
#include <linux/uaccess.h>
#include <asm/cacheflush.h>
@@ -1251,6 +1252,9 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
if (!len)
return -EINVAL;
+ /* If PROT_SHADOWSTACK is specified but the arch doesn't support it, return -EINVAL */
+ if ((prot & PROT_SHADOWSTACK) && !arch_supports_shadow_stack())
+ return -EINVAL;
/*
* Does the application expect PROT_READ to imply PROT_EXEC?
*
--
2.25.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v1 RFC Zisslpcfi 07/20] elf: ELF header parsing in GNU property for cfi state
[not found] <20230213045351.3945824-1-debug@rivosinc.com>
2023-02-13 4:53 ` [PATCH v1 RFC Zisslpcfi 05/20] mmap : Introducing new protection "PROT_SHADOWSTACK" for mmap Deepak Gupta
@ 2023-02-13 4:53 ` Deepak Gupta
2023-02-13 4:53 ` [PATCH v1 RFC Zisslpcfi 08/20] riscv: ELF header parsing in GNU property for riscv zisslpcfi Deepak Gupta
2023-02-13 4:53 ` [PATCH v1 RFC Zisslpcfi 11/20] mmu: maybe_mkwrite updated to manufacture shadow stack PTEs Deepak Gupta
3 siblings, 0 replies; 10+ messages in thread
From: Deepak Gupta @ 2023-02-13 4:53 UTC (permalink / raw)
To: linux-kernel, linux-riscv, Alexander Viro, Eric Biederman,
Kees Cook, Paul Walmsley, Palmer Dabbelt, Albert Ou
Cc: Deepak Gupta, linux-fsdevel, linux-mm
Binaries enabled with support for control-flow integrity will have new
instructions that may fault on cpus which don't implement cfi mechanisms.
This change adds
- a stub for setting up cfi state when loading a binary. Architecture-specific
implementations can choose to implement this stub and set up
cfi state for the program.
- riscv ELF flag markers for forward cfi and backward cfi in
uapi/linux/elf.h
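The feature bits defined in the uapi/linux/elf.h hunk below are consumed later in the series (patch 08) to derive per-binary cfi flags. A small sketch of that mapping, assuming the CPU supports both mechanisms (constants copied from the diffs in this thread):

```c
#include <assert.h>

/* From the uapi/linux/elf.h hunk in this patch */
#define GNU_PROPERTY_RISCV_FEATURE_1_FCFI (1u << 0)
#define GNU_PROPERTY_RISCV_FEATURE_1_BCFI (1u << 1)

/* From the arch/riscv/include/asm/elf.h hunk in patch 08 */
#define RISCV_ELF_FCFI (1 << 0)
#define RISCV_ELF_BCFI (1 << 1)

/* Given the 32-bit feature word from the .note.gnu.property payload,
 * derive the arch_elf_state flags the way arch_parse_elf_property()
 * does in patch 08 (here both CPU capabilities are assumed present). */
int cfi_flags_from_property(unsigned int feature_word)
{
    int flags = 0;

    if (feature_word & GNU_PROPERTY_RISCV_FEATURE_1_FCFI)
        flags |= RISCV_ELF_FCFI;
    if (feature_word & GNU_PROPERTY_RISCV_FEATURE_1_BCFI)
        flags |= RISCV_ELF_BCFI;
    return flags;
}
```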
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
fs/binfmt_elf.c | 5 +++++
include/linux/elf.h | 8 ++++++++
include/uapi/linux/elf.h | 6 ++++++
3 files changed, 19 insertions(+)
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 9a780fafc539..bb431052eb01 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1277,6 +1277,11 @@ static int load_elf_binary(struct linux_binprm *bprm)
set_binfmt(&elf_format);
+#if defined(CONFIG_USER_SHADOW_STACK) || defined(CONFIG_USER_INDIRECT_BR_LP)
+ retval = arch_elf_setup_cfi_state(&arch_state);
+ if (retval < 0)
+ goto out;
+#endif
#ifdef ARCH_HAS_SETUP_ADDITIONAL_PAGES
retval = ARCH_SETUP_ADDITIONAL_PAGES(bprm, elf_ex, !!interpreter);
if (retval < 0)
diff --git a/include/linux/elf.h b/include/linux/elf.h
index c9a46c4e183b..106d28f065aa 100644
--- a/include/linux/elf.h
+++ b/include/linux/elf.h
@@ -109,4 +109,12 @@ static inline int arch_elf_adjust_prot(int prot,
}
#endif
+#if defined(CONFIG_USER_SHADOW_STACK) || defined(CONFIG_USER_INDIRECT_BR_LP)
+extern int arch_elf_setup_cfi_state(const struct arch_elf_state *state);
+#else
+static inline int arch_elf_setup_cfi_state(const struct arch_elf_state *state)
+{
+ return 0;
+}
+#endif
#endif /* _LINUX_ELF_H */
diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
index 4c6a8fa5e7ed..1cbd332061dc 100644
--- a/include/uapi/linux/elf.h
+++ b/include/uapi/linux/elf.h
@@ -468,4 +468,10 @@ typedef struct elf64_note {
/* Bits for GNU_PROPERTY_AARCH64_FEATURE_1_BTI */
#define GNU_PROPERTY_AARCH64_FEATURE_1_BTI (1U << 0)
+/* .note.gnu.property types for RISCV: */
+/* Bits for GNU_PROPERTY_RISCV_FEATURE_1_FCFI/BCFI */
+#define GNU_PROPERTY_RISCV_FEATURE_1_AND 0xc0000000
+#define GNU_PROPERTY_RISCV_FEATURE_1_FCFI (1u << 0)
+#define GNU_PROPERTY_RISCV_FEATURE_1_BCFI (1u << 1)
+
#endif /* _UAPI_LINUX_ELF_H */
--
2.25.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v1 RFC Zisslpcfi 08/20] riscv: ELF header parsing in GNU property for riscv zisslpcfi
[not found] <20230213045351.3945824-1-debug@rivosinc.com>
2023-02-13 4:53 ` [PATCH v1 RFC Zisslpcfi 05/20] mmap : Introducing new protection "PROT_SHADOWSTACK" for mmap Deepak Gupta
2023-02-13 4:53 ` [PATCH v1 RFC Zisslpcfi 07/20] elf: ELF header parsing in GNU property for cfi state Deepak Gupta
@ 2023-02-13 4:53 ` Deepak Gupta
2023-02-13 4:53 ` [PATCH v1 RFC Zisslpcfi 11/20] mmu: maybe_mkwrite updated to manufacture shadow stack PTEs Deepak Gupta
3 siblings, 0 replies; 10+ messages in thread
From: Deepak Gupta @ 2023-02-13 4:53 UTC (permalink / raw)
To: linux-kernel, linux-riscv, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Eric Biederman, Kees Cook
Cc: Deepak Gupta, linux-mm
Binaries enabled for Zisslpcfi will have new instructions that may fault
on risc-v cpus which don't implement Zimops or Zicfi. This change adds
- support for parsing the new backward and forward cfi flags in
PT_GNU_PROPERTY
- setting cfi state upon recognizing cfi flags in the ELF
- enabling backward cfi and forward cfi in sstatus
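When the caller passes a size of 0, allocate_shadow_stack() below sizes the shadow stack as min(RLIMIT_STACK, 4G) rounded up to a page. A standalone sketch of that default-sizing arithmetic (rlimit_stack stands in for rlimit(RLIMIT_STACK); PAGE_SIZE of 4K is an assumption for illustration):

```c
#include <assert.h>

#define PAGE_SIZE 4096ULL
#define SZ_4G (4ULL * 1024 * 1024 * 1024)

/* Mirrors the default sizing in allocate_shadow_stack() below:
 * round_up(min(rlimit_stack, SZ_4G), PAGE_SIZE). */
unsigned long long default_shadow_stack_size(unsigned long long rlimit_stack)
{
    unsigned long long size = rlimit_stack < SZ_4G ? rlimit_stack : SZ_4G;

    /* round_up(size, PAGE_SIZE) for a power-of-two page size */
    return (size + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
}
```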
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
arch/riscv/include/asm/elf.h | 54 +++++++++++++++++++++++++++++
arch/riscv/kernel/process.c | 67 ++++++++++++++++++++++++++++++++++++
2 files changed, 121 insertions(+)
diff --git a/arch/riscv/include/asm/elf.h b/arch/riscv/include/asm/elf.h
index e7acffdf21d2..60ac2d2390ee 100644
--- a/arch/riscv/include/asm/elf.h
+++ b/arch/riscv/include/asm/elf.h
@@ -14,6 +14,7 @@
#include <asm/auxvec.h>
#include <asm/byteorder.h>
#include <asm/cacheinfo.h>
+#include <linux/processor.h>
/*
* These are used to set parameters in the core dumps.
@@ -140,4 +141,57 @@ extern int compat_arch_setup_additional_pages(struct linux_binprm *bprm,
compat_arch_setup_additional_pages
#endif /* CONFIG_COMPAT */
+
+#define RISCV_ELF_FCFI (1 << 0)
+#define RISCV_ELF_BCFI (1 << 1)
+
+#ifdef CONFIG_ARCH_BINFMT_ELF_STATE
+struct arch_elf_state {
+ int flags;
+};
+
+#define INIT_ARCH_ELF_STATE { \
+ .flags = 0, \
+}
+#endif
+
+#ifdef CONFIG_ARCH_USE_GNU_PROPERTY
+static inline int arch_parse_elf_property(u32 type, const void *data,
+ size_t datasz, bool compat,
+ struct arch_elf_state *arch)
+{
+ /*
+ * TODO: Do we want to support this in 32-bit/compat?
+ * Maybe return 0 for now.
+ */
+ if (IS_ENABLED(CONFIG_COMPAT) && compat)
+ return 0;
+ if ((type & GNU_PROPERTY_RISCV_FEATURE_1_AND) == GNU_PROPERTY_RISCV_FEATURE_1_AND) {
+ const u32 *p = data;
+
+ if (datasz != sizeof(*p))
+ return -ENOEXEC;
+ if (arch_supports_indirect_br_lp_instr() &&
+ (*p & GNU_PROPERTY_RISCV_FEATURE_1_FCFI))
+ arch->flags |= RISCV_ELF_FCFI;
+ if (arch_supports_shadow_stack() && (*p & GNU_PROPERTY_RISCV_FEATURE_1_BCFI))
+ arch->flags |= RISCV_ELF_BCFI;
+ }
+ return 0;
+}
+
+static inline int arch_elf_pt_proc(void *ehdr, void *phdr,
+ struct file *f, bool is_interp,
+ struct arch_elf_state *state)
+{
+ return 0;
+}
+
+static inline int arch_check_elf(void *ehdr, bool has_interp,
+ void *interp_ehdr,
+ struct arch_elf_state *state)
+{
+ return 0;
+}
+#endif
#endif /* _ASM_RISCV_ELF_H */
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index 8955f2432c2d..db676262e61e 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -24,6 +24,7 @@
#include <asm/switch_to.h>
#include <asm/thread_info.h>
#include <asm/cpuidle.h>
+#include <linux/mman.h>
register unsigned long gp_in_global __asm__("gp");
@@ -135,6 +136,14 @@ void start_thread(struct pt_regs *regs, unsigned long pc,
else
regs->status |= SR_UXL_64;
#endif
+#ifdef CONFIG_USER_SHADOW_STACK
+ if (current_thread_info()->user_cfi_state.ufcfi_en)
+ regs->status |= SR_UFCFIEN;
+#endif
+#ifdef CONFIG_USER_INDIRECT_BR_LP
+ if (current_thread_info()->user_cfi_state.ubcfi_en)
+ regs->status |= SR_UBCFIEN;
+#endif
}
void flush_thread(void)
@@ -189,3 +198,61 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
p->thread.sp = (unsigned long)childregs; /* kernel sp */
return 0;
}
+
+
+int allocate_shadow_stack(unsigned long *shadow_stack_base, unsigned long *shdw_size)
+{
+ int flags = MAP_ANONYMOUS | MAP_PRIVATE;
+ struct mm_struct *mm = current->mm;
+ unsigned long addr, populate, size;
+ *shadow_stack_base = 0;
+
+ if (!shdw_size)
+ return -EINVAL;
+
+ size = *shdw_size;
+
+ /* If size is 0, compute a default size */
+ if (size == 0)
+ size = round_up(min_t(unsigned long long, rlimit(RLIMIT_STACK), SZ_4G), PAGE_SIZE);
+ mmap_write_lock(mm);
+ addr = do_mmap(NULL, 0, size, PROT_SHADOWSTACK, flags, 0,
+ &populate, NULL);
+ mmap_write_unlock(mm);
+ if (IS_ERR_VALUE(addr))
+ return PTR_ERR((void *)addr);
+ *shadow_stack_base = addr;
+ *shdw_size = size;
+ return 0;
+}
+
+#if defined(CONFIG_USER_SHADOW_STACK) || defined(CONFIG_USER_INDIRECT_BR_LP)
+/* Called from load_elf_binary(). Sets up the shadow stack and enables forward cfi */
+int arch_elf_setup_cfi_state(const struct arch_elf_state *state)
+{
+ int ret = 0;
+ unsigned long shadow_stack_base = 0;
+ unsigned long shadow_stk_size = 0;
+ struct thread_info *info = NULL;
+
+ info = current_thread_info();
+ /* setup back cfi state */
+ /* setup cfi state only if implementation supports it */
+ if (arch_supports_shadow_stack() && (state->flags & RISCV_ELF_BCFI)) {
+ info->user_cfi_state.ubcfi_en = 1;
+ ret = allocate_shadow_stack(&shadow_stack_base, &shadow_stk_size);
+ if (ret)
+ return ret;
+
+ info->user_cfi_state.user_shdw_stk = (shadow_stack_base + shadow_stk_size);
+ info->user_cfi_state.shdw_stk_base = shadow_stack_base;
+ }
+ /* setup forward cfi state */
+ if (arch_supports_indirect_br_lp_instr() && (state->flags & RISCV_ELF_FCFI)) {
+ info->user_cfi_state.ufcfi_en = 1;
+ info->user_cfi_state.lp_label = 0;
+ }
+
+ return ret;
+}
+#endif
\ No newline at end of file
--
2.25.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v1 RFC Zisslpcfi 11/20] mmu: maybe_mkwrite updated to manufacture shadow stack PTEs
[not found] <20230213045351.3945824-1-debug@rivosinc.com>
` (2 preceding siblings ...)
2023-02-13 4:53 ` [PATCH v1 RFC Zisslpcfi 08/20] riscv: ELF header parsing in GNU property for riscv zisslpcfi Deepak Gupta
@ 2023-02-13 4:53 ` Deepak Gupta
2023-02-13 12:05 ` David Hildenbrand
3 siblings, 1 reply; 10+ messages in thread
From: Deepak Gupta @ 2023-02-13 4:53 UTC (permalink / raw)
To: linux-kernel, linux-riscv, Andrew Morton; +Cc: Deepak Gupta, linux-mm
maybe_mkwrite creates PTEs with WRITE encodings for the underlying arch
if VM_WRITE is turned on in vma->vm_flags. Shadow stack memory is
writeable memory, except it can only be written by certain specific
instructions. This patch allows maybe_mkwrite to create shadow stack
PTEs if the vma is a shadow stack VMA. Each arch can define which
combination of VMA flags means a shadow stack.
Additionally, pte_mkshdwstk must be provided by arch-specific PTE
construction headers (arch-specific pgtable.h) to create shadow stack
PTEs.
This patch provides a dummy/stub pte_mkshdwstk if CONFIG_USER_SHADOW_STACK
is not selected.
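The dispatch this patch adds to maybe_mkwrite() can be sketched in isolation. The pte_t struct and the two helpers here are toy stand-ins just to show the control flow; the real helpers set hardware PTE bits, and the shadow-stack predicate really takes the vma:

```c
#include <assert.h>

#define VM_WRITE 0x2UL

/* Toy stand-ins for pte_t and the arch helpers, for illustration only */
typedef struct { int write; int shadow; } pte_t;

pte_t pte_mkwrite(pte_t pte)   { pte.write = 1;  return pte; }
pte_t pte_mkshdwstk(pte_t pte) { pte.shadow = 1; return pte; }

/* vm_flags plus a shadow-stack predicate take the place of the vma:
 * only VM_WRITE mappings become writeable at all, and shadow stack
 * VMAs get the shadow stack encoding instead of the regular one. */
pte_t maybe_mkwrite(pte_t pte, unsigned long vm_flags, int is_shadow_stack)
{
    if (vm_flags & VM_WRITE) {
        if (is_shadow_stack)
            pte = pte_mkshdwstk(pte);
        else
            pte = pte_mkwrite(pte);
    }
    return pte;
}
```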
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
include/linux/mm.h | 23 +++++++++++++++++++++--
include/linux/pgtable.h | 4 ++++
2 files changed, 25 insertions(+), 2 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8f857163ac89..a7705bc49bfe 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1093,6 +1093,21 @@ static inline unsigned long thp_size(struct page *page)
void free_compound_page(struct page *page);
#ifdef CONFIG_MMU
+
+#ifdef CONFIG_USER_SHADOW_STACK
+bool arch_is_shadow_stack_vma(struct vm_area_struct *vma);
+#endif
+
+static inline bool
+is_shadow_stack_vma(struct vm_area_struct *vma)
+{
+#ifdef CONFIG_USER_SHADOW_STACK
+ return arch_is_shadow_stack_vma(vma);
+#else
+ return false;
+#endif
+}
+
/*
* Do pte_mkwrite, but only if the vma says VM_WRITE. We do this when
* servicing faults for write access. In the normal case, do always want
@@ -1101,8 +1116,12 @@ void free_compound_page(struct page *page);
*/
static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
{
- if (likely(vma->vm_flags & VM_WRITE))
- pte = pte_mkwrite(pte);
+ if (likely(vma->vm_flags & VM_WRITE)) {
+ if (unlikely(is_shadow_stack_vma(vma)))
+ pte = pte_mkshdwstk(pte);
+ else
+ pte = pte_mkwrite(pte);
+ }
return pte;
}
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 1159b25b0542..94b157218c73 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1736,4 +1736,8 @@ pgprot_t vm_get_page_prot(unsigned long vm_flags) \
} \
EXPORT_SYMBOL(vm_get_page_prot);
+#ifndef CONFIG_USER_SHADOW_STACK
+#define pte_mkshdwstk(pte) pte
+#endif
+
#endif /* _LINUX_PGTABLE_H */
--
2.25.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH v1 RFC Zisslpcfi 11/20] mmu: maybe_mkwrite updated to manufacture shadow stack PTEs
2023-02-13 4:53 ` [PATCH v1 RFC Zisslpcfi 11/20] mmu: maybe_mkwrite updated to manufacture shadow stack PTEs Deepak Gupta
@ 2023-02-13 12:05 ` David Hildenbrand
2023-02-13 14:37 ` Deepak Gupta
0 siblings, 1 reply; 10+ messages in thread
From: David Hildenbrand @ 2023-02-13 12:05 UTC (permalink / raw)
To: Deepak Gupta, linux-kernel, linux-riscv, Andrew Morton; +Cc: linux-mm
On 13.02.23 05:53, Deepak Gupta wrote:
> maybe_mkwrite creates PTEs with WRITE encodings for underlying arch if
> VM_WRITE is turned on in vma->vm_flags. Shadow stack memory is a write-
> able memory except it can only be written by certain specific
> instructions. This patch allows maybe_mkwrite to create shadow stack PTEs
> if vma is shadow stack VMA. Each arch can define which combination of VMA
> flags means a shadow stack.
>
> Additionally pte_mkshdwstk must be provided by arch specific PTE
> construction headers to create shadow stack PTEs. (in arch specific
> pgtable.h).
>
> This patch provides dummy/stub pte_mkshdwstk if CONFIG_USER_SHADOW_STACK
> is not selected.
>
> Signed-off-by: Deepak Gupta <debug@rivosinc.com>
> ---
> include/linux/mm.h | 23 +++++++++++++++++++++--
> include/linux/pgtable.h | 4 ++++
> 2 files changed, 25 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 8f857163ac89..a7705bc49bfe 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1093,6 +1093,21 @@ static inline unsigned long thp_size(struct page *page)
> void free_compound_page(struct page *page);
>
> #ifdef CONFIG_MMU
> +
> +#ifdef CONFIG_USER_SHADOW_STACK
> +bool arch_is_shadow_stack_vma(struct vm_area_struct *vma);
> +#endif
> +
> +static inline bool
> +is_shadow_stack_vma(struct vm_area_struct *vma)
> +{
> +#ifdef CONFIG_USER_SHADOW_STACK
> + return arch_is_shadow_stack_vma(vma);
> +#else
> + return false;
> +#endif
> +}
> +
> /*
> * Do pte_mkwrite, but only if the vma says VM_WRITE. We do this when
> * servicing faults for write access. In the normal case, do always want
> @@ -1101,8 +1116,12 @@ void free_compound_page(struct page *page);
> */
> static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
> {
> - if (likely(vma->vm_flags & VM_WRITE))
> - pte = pte_mkwrite(pte);
> + if (likely(vma->vm_flags & VM_WRITE)) {
> + if (unlikely(is_shadow_stack_vma(vma)))
> + pte = pte_mkshdwstk(pte);
> + else
> + pte = pte_mkwrite(pte);
> + }
> return pte;
Exactly what we are trying to avoid in the x86 approach right now.
Please see the x86 series on details, we shouldn't try reinventing the
wheel but finding a core-mm approach that fits multiple architectures.
https://lkml.kernel.org/r/20230119212317.8324-1-rick.p.edgecombe@intel.com
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v1 RFC Zisslpcfi 11/20] mmu: maybe_mkwrite updated to manufacture shadow stack PTEs
2023-02-13 12:05 ` David Hildenbrand
@ 2023-02-13 14:37 ` Deepak Gupta
2023-02-13 14:56 ` David Hildenbrand
0 siblings, 1 reply; 10+ messages in thread
From: Deepak Gupta @ 2023-02-13 14:37 UTC (permalink / raw)
To: David Hildenbrand; +Cc: linux-kernel, linux-riscv, Andrew Morton, linux-mm
On Mon, Feb 13, 2023 at 01:05:16PM +0100, David Hildenbrand wrote:
>On 13.02.23 05:53, Deepak Gupta wrote:
>>maybe_mkwrite creates PTEs with WRITE encodings for underlying arch if
>>VM_WRITE is turned on in vma->vm_flags. Shadow stack memory is a write-
>>able memory except it can only be written by certain specific
>>instructions. This patch allows maybe_mkwrite to create shadow stack PTEs
>>if vma is shadow stack VMA. Each arch can define which combination of VMA
>>flags means a shadow stack.
>>
>>Additionally pte_mkshdwstk must be provided by arch specific PTE
>>construction headers to create shadow stack PTEs. (in arch specific
>>pgtable.h).
>>
>>This patch provides dummy/stub pte_mkshdwstk if CONFIG_USER_SHADOW_STACK
>>is not selected.
>>
>>Signed-off-by: Deepak Gupta <debug@rivosinc.com>
>>---
>> include/linux/mm.h | 23 +++++++++++++++++++++--
>> include/linux/pgtable.h | 4 ++++
>> 2 files changed, 25 insertions(+), 2 deletions(-)
>>
>>diff --git a/include/linux/mm.h b/include/linux/mm.h
>>index 8f857163ac89..a7705bc49bfe 100644
>>--- a/include/linux/mm.h
>>+++ b/include/linux/mm.h
>>@@ -1093,6 +1093,21 @@ static inline unsigned long thp_size(struct page *page)
>> void free_compound_page(struct page *page);
>> #ifdef CONFIG_MMU
>>+
>>+#ifdef CONFIG_USER_SHADOW_STACK
>>+bool arch_is_shadow_stack_vma(struct vm_area_struct *vma);
>>+#endif
>>+
>>+static inline bool
>>+is_shadow_stack_vma(struct vm_area_struct *vma)
>>+{
>>+#ifdef CONFIG_USER_SHADOW_STACK
>>+ return arch_is_shadow_stack_vma(vma);
>>+#else
>>+ return false;
>>+#endif
>>+}
>>+
>> /*
>> * Do pte_mkwrite, but only if the vma says VM_WRITE. We do this when
>> * servicing faults for write access. In the normal case, do always want
>>@@ -1101,8 +1116,12 @@ void free_compound_page(struct page *page);
>> */
>> static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
>> {
>>- if (likely(vma->vm_flags & VM_WRITE))
>>- pte = pte_mkwrite(pte);
>>+ if (likely(vma->vm_flags & VM_WRITE)) {
>>+ if (unlikely(is_shadow_stack_vma(vma)))
>>+ pte = pte_mkshdwstk(pte);
>>+ else
>>+ pte = pte_mkwrite(pte);
>>+ }
>> return pte;
>
>Exactly what we are trying to avoid in the x86 approach right now.
>Please see the x86 series on details, we shouldn't try reinventing the
>wheel but finding a core-mm approach that fits multiple architectures.
>
>https://lkml.kernel.org/r/20230119212317.8324-1-rick.p.edgecombe@intel.com
Thanks David for the comment here. I looked at the x86 approach. This patch
is actually written in a way which is not re-inventing the wheel and follows
a core-mm approach that fits multiple architectures.
The change above checks `is_shadow_stack_vma`, and only if it returns true
does it manufacture a shadow stack pte; otherwise it makes a regular
writeable mapping.
Now if we look at the `is_shadow_stack_vma` implementation, it returns false
if `CONFIG_USER_SHADOW_STACK` is not defined. If `CONFIG_USER_SHADOW_STACK`
is defined then it calls `arch_is_shadow_stack_vma`, which should be
implemented by arch-specific code. This allows each architecture to define
its own vma flag encodings for shadow stack (riscv chooses the presence of
only `VM_WRITE`, which is analogous to the chosen PTE encodings on riscv:
W=1,R=0,X=0).
Additionally, pte_mkshdwstk will be a nop if not implemented by the architecture.
Let me know if this makes sense. If I am missing something here, let me know.
>
>--
>Thanks,
>
>David / dhildenb
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v1 RFC Zisslpcfi 11/20] mmu: maybe_mkwrite updated to manufacture shadow stack PTEs
2023-02-13 14:37 ` Deepak Gupta
@ 2023-02-13 14:56 ` David Hildenbrand
2023-02-13 20:01 ` Deepak Gupta
0 siblings, 1 reply; 10+ messages in thread
From: David Hildenbrand @ 2023-02-13 14:56 UTC (permalink / raw)
To: Deepak Gupta; +Cc: linux-kernel, linux-riscv, Andrew Morton, linux-mm
On 13.02.23 15:37, Deepak Gupta wrote:
> On Mon, Feb 13, 2023 at 01:05:16PM +0100, David Hildenbrand wrote:
>> On 13.02.23 05:53, Deepak Gupta wrote:
>>> maybe_mkwrite creates PTEs with WRITE encodings for underlying arch if
>>> VM_WRITE is turned on in vma->vm_flags. Shadow stack memory is a write-
>>> able memory except it can only be written by certain specific
>>> instructions. This patch allows maybe_mkwrite to create shadow stack PTEs
>>> if vma is shadow stack VMA. Each arch can define which combination of VMA
>>> flags means a shadow stack.
>>>
>>> Additionally pte_mkshdwstk must be provided by arch specific PTE
>>> construction headers to create shadow stack PTEs. (in arch specific
>>> pgtable.h).
>>>
>>> This patch provides dummy/stub pte_mkshdwstk if CONFIG_USER_SHADOW_STACK
>>> is not selected.
>>>
>>> Signed-off-by: Deepak Gupta <debug@rivosinc.com>
>>> ---
>>> include/linux/mm.h | 23 +++++++++++++++++++++--
>>> include/linux/pgtable.h | 4 ++++
>>> 2 files changed, 25 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>> index 8f857163ac89..a7705bc49bfe 100644
>>> --- a/include/linux/mm.h
>>> +++ b/include/linux/mm.h
>>> @@ -1093,6 +1093,21 @@ static inline unsigned long thp_size(struct page *page)
>>> void free_compound_page(struct page *page);
>>> #ifdef CONFIG_MMU
>>> +
>>> +#ifdef CONFIG_USER_SHADOW_STACK
>>> +bool arch_is_shadow_stack_vma(struct vm_area_struct *vma);
>>> +#endif
>>> +
>>> +static inline bool
>>> +is_shadow_stack_vma(struct vm_area_struct *vma)
>>> +{
>>> +#ifdef CONFIG_USER_SHADOW_STACK
>>> + return arch_is_shadow_stack_vma(vma);
>>> +#else
>>> + return false;
>>> +#endif
>>> +}
>>> +
>>> /*
>>> * Do pte_mkwrite, but only if the vma says VM_WRITE. We do this when
>>> * servicing faults for write access. In the normal case, do always want
>>> @@ -1101,8 +1116,12 @@ void free_compound_page(struct page *page);
>>> */
>>> static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
>>> {
>>> - if (likely(vma->vm_flags & VM_WRITE))
>>> - pte = pte_mkwrite(pte);
>>> + if (likely(vma->vm_flags & VM_WRITE)) {
>>> + if (unlikely(is_shadow_stack_vma(vma)))
>>> + pte = pte_mkshdwstk(pte);
>>> + else
>>> + pte = pte_mkwrite(pte);
>>> + }
>>> return pte;
>>
>> Exactly what we are trying to avoid in the x86 approach right now.
>> Please see the x86 series on details, we shouldn't try reinventing the
>> wheel but finding a core-mm approach that fits multiple architectures.
>>
>> https://lkml.kernel.org/r/20230119212317.8324-1-rick.p.edgecombe@intel.com
>
> Thanks David for comment here. I looked at x86 approach. This patch
> actually written in a way which is not re-inventing wheel and is following
> a core-mm approach that fits multiple architectures.
>
> Change above checks `is_shadow_stack_vma` and if it returns true then only
> it manufactures shadow stack pte else it'll make a regular writeable mapping.
>
> Now if we look at `is_shadow_stack_vma` implementation, it returns false if
> `CONFIG_USER_SHADOW_STACK` is not defined. If `CONFIG_USER_SHADOW_STACK is
> defined then it calls `arch_is_shadow_stack_vma` which should be implemented
> by arch specific code. This allows each architecture to define their own vma
> flag encodings for shadow stack (riscv chooses presence of only `VM_WRITE`
> which is analogous to choosen PTE encodings on riscv W=1,R=0,X=0)
>
> Additionally pte_mkshdwstk will be nop if not implemented by architecture.
>
> Let me know if this make sense. If I am missing something here, let me know.
See the discussion in that thread. The idea is to pass a VMA to
pte_mkwrite() and let it handle how to actually set it writable.
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v1 RFC Zisslpcfi 11/20] mmu: maybe_mkwrite updated to manufacture shadow stack PTEs
2023-02-13 14:56 ` David Hildenbrand
@ 2023-02-13 20:01 ` Deepak Gupta
2023-02-14 12:10 ` David Hildenbrand
0 siblings, 1 reply; 10+ messages in thread
From: Deepak Gupta @ 2023-02-13 20:01 UTC (permalink / raw)
To: David Hildenbrand; +Cc: linux-kernel, linux-riscv, Andrew Morton, linux-mm
On Mon, Feb 13, 2023 at 03:56:22PM +0100, David Hildenbrand wrote:
>On 13.02.23 15:37, Deepak Gupta wrote:
>>On Mon, Feb 13, 2023 at 01:05:16PM +0100, David Hildenbrand wrote:
>>>On 13.02.23 05:53, Deepak Gupta wrote:
>>>>maybe_mkwrite creates PTEs with WRITE encodings for underlying arch if
>>>>VM_WRITE is turned on in vma->vm_flags. Shadow stack memory is a write-
>>>>able memory except it can only be written by certain specific
>>>>instructions. This patch allows maybe_mkwrite to create shadow stack PTEs
>>>>if vma is shadow stack VMA. Each arch can define which combination of VMA
>>>>flags means a shadow stack.
>>>>
>>>>Additionally pte_mkshdwstk must be provided by arch specific PTE
>>>>construction headers to create shadow stack PTEs. (in arch specific
>>>>pgtable.h).
>>>>
>>>>This patch provides dummy/stub pte_mkshdwstk if CONFIG_USER_SHADOW_STACK
>>>>is not selected.
>>>>
>>>>Signed-off-by: Deepak Gupta <debug@rivosinc.com>
>>>>---
>>>> include/linux/mm.h | 23 +++++++++++++++++++++--
>>>> include/linux/pgtable.h | 4 ++++
>>>> 2 files changed, 25 insertions(+), 2 deletions(-)
>>>>
>>>>diff --git a/include/linux/mm.h b/include/linux/mm.h
>>>>index 8f857163ac89..a7705bc49bfe 100644
>>>>--- a/include/linux/mm.h
>>>>+++ b/include/linux/mm.h
>>>>@@ -1093,6 +1093,21 @@ static inline unsigned long thp_size(struct page *page)
>>>> void free_compound_page(struct page *page);
>>>> #ifdef CONFIG_MMU
>>>>+
>>>>+#ifdef CONFIG_USER_SHADOW_STACK
>>>>+bool arch_is_shadow_stack_vma(struct vm_area_struct *vma);
>>>>+#endif
>>>>+
>>>>+static inline bool
>>>>+is_shadow_stack_vma(struct vm_area_struct *vma)
>>>>+{
>>>>+#ifdef CONFIG_USER_SHADOW_STACK
>>>>+ return arch_is_shadow_stack_vma(vma);
>>>>+#else
>>>>+ return false;
>>>>+#endif
>>>>+}
>>>>+
>>>> /*
>>>> * Do pte_mkwrite, but only if the vma says VM_WRITE. We do this when
>>>> * servicing faults for write access. In the normal case, do always want
>>>>@@ -1101,8 +1116,12 @@ void free_compound_page(struct page *page);
>>>> */
>>>> static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
>>>> {
>>>>- if (likely(vma->vm_flags & VM_WRITE))
>>>>- pte = pte_mkwrite(pte);
>>>>+ if (likely(vma->vm_flags & VM_WRITE)) {
>>>>+ if (unlikely(is_shadow_stack_vma(vma)))
>>>>+ pte = pte_mkshdwstk(pte);
>>>>+ else
>>>>+ pte = pte_mkwrite(pte);
>>>>+ }
>>>> return pte;
>>>
>>>Exactly what we are trying to avoid in the x86 approach right now.
>>>Please see the x86 series on details, we shouldn't try reinventing the
>>>wheel but finding a core-mm approach that fits multiple architectures.
>>>
>>>https://lkml.kernel.org/r/20230119212317.8324-1-rick.p.edgecombe@intel.com
>>
>>Thanks David for comment here. I looked at x86 approach. This patch
>>actually written in a way which is not re-inventing wheel and is following
>>a core-mm approach that fits multiple architectures.
>>
>>Change above checks `is_shadow_stack_vma` and if it returns true then only
>>it manufactures shadow stack pte else it'll make a regular writeable mapping.
>>
>>Now if we look at `is_shadow_stack_vma` implementation, it returns false if
>>`CONFIG_USER_SHADOW_STACK` is not defined. If `CONFIG_USER_SHADOW_STACK is
>>defined then it calls `arch_is_shadow_stack_vma` which should be implemented
>>by arch specific code. This allows each architecture to define their own vma
>>flag encodings for shadow stack (riscv chooses presence of only `VM_WRITE`
>>which is analogous to choosen PTE encodings on riscv W=1,R=0,X=0)
>>
>>Additionally pte_mkshdwstk will be nop if not implemented by architecture.
>>
>>Let me know if this make sense. If I am missing something here, let me know.
>
>See the discussion in that thread. The idea is to pass a VMA to
>pte_mkwrite() and let it handle how to actually set it writable.
>
Thanks. I see. There are instances where `pte_mkwrite` is directly invoked
after checking VM_WRITE, so instead of fixing all those instances, the idea
is to make pte_mkwrite itself take the vma flags or the vma.
I'll revise.
>--
>Thanks,
>
>David / dhildenb
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v1 RFC Zisslpcfi 11/20] mmu: maybe_mkwrite updated to manufacture shadow stack PTEs
2023-02-13 20:01 ` Deepak Gupta
@ 2023-02-14 12:10 ` David Hildenbrand
2023-02-14 18:27 ` Edgecombe, Rick P
0 siblings, 1 reply; 10+ messages in thread
From: David Hildenbrand @ 2023-02-14 12:10 UTC (permalink / raw)
To: Deepak Gupta
Cc: linux-kernel, linux-riscv, Andrew Morton, linux-mm,
Edgecombe, Rick P
On 13.02.23 21:01, Deepak Gupta wrote:
> On Mon, Feb 13, 2023 at 03:56:22PM +0100, David Hildenbrand wrote:
>> On 13.02.23 15:37, Deepak Gupta wrote:
>>> On Mon, Feb 13, 2023 at 01:05:16PM +0100, David Hildenbrand wrote:
>>>> On 13.02.23 05:53, Deepak Gupta wrote:
>>>>> maybe_mkwrite creates PTEs with WRITE encodings for underlying arch if
>>>>> VM_WRITE is turned on in vma->vm_flags. Shadow stack memory is a write-
>>>>> able memory except it can only be written by certain specific
>>>>> instructions. This patch allows maybe_mkwrite to create shadow stack PTEs
>>>>> if vma is shadow stack VMA. Each arch can define which combination of VMA
>>>>> flags means a shadow stack.
>>>>>
>>>>> Additionally pte_mkshdwstk must be provided by arch specific PTE
>>>>> construction headers to create shadow stack PTEs. (in arch specific
>>>>> pgtable.h).
>>>>>
>>>>> This patch provides dummy/stub pte_mkshdwstk if CONFIG_USER_SHADOW_STACK
>>>>> is not selected.
>>>>>
>>>>> Signed-off-by: Deepak Gupta <debug@rivosinc.com>
>>>>> ---
>>>>> include/linux/mm.h | 23 +++++++++++++++++++++--
>>>>> include/linux/pgtable.h | 4 ++++
>>>>> 2 files changed, 25 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>>>> index 8f857163ac89..a7705bc49bfe 100644
>>>>> --- a/include/linux/mm.h
>>>>> +++ b/include/linux/mm.h
>>>>> @@ -1093,6 +1093,21 @@ static inline unsigned long thp_size(struct page *page)
>>>>> void free_compound_page(struct page *page);
>>>>> #ifdef CONFIG_MMU
>>>>> +
>>>>> +#ifdef CONFIG_USER_SHADOW_STACK
>>>>> +bool arch_is_shadow_stack_vma(struct vm_area_struct *vma);
>>>>> +#endif
>>>>> +
>>>>> +static inline bool
>>>>> +is_shadow_stack_vma(struct vm_area_struct *vma)
>>>>> +{
>>>>> +#ifdef CONFIG_USER_SHADOW_STACK
>>>>> + return arch_is_shadow_stack_vma(vma);
>>>>> +#else
>>>>> + return false;
>>>>> +#endif
>>>>> +}
>>>>> +
>>>>> /*
>>>>> * Do pte_mkwrite, but only if the vma says VM_WRITE. We do this when
>>>>> * servicing faults for write access. In the normal case, do always want
>>>>> @@ -1101,8 +1116,12 @@ void free_compound_page(struct page *page);
>>>>> */
>>>>> static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
>>>>> {
>>>>> - if (likely(vma->vm_flags & VM_WRITE))
>>>>> - pte = pte_mkwrite(pte);
>>>>> + if (likely(vma->vm_flags & VM_WRITE)) {
>>>>> + if (unlikely(is_shadow_stack_vma(vma)))
>>>>> + pte = pte_mkshdwstk(pte);
>>>>> + else
>>>>> + pte = pte_mkwrite(pte);
>>>>> + }
>>>>> return pte;
>>>>
>>>> Exactly what we are trying to avoid in the x86 approach right now.
>>>> Please see the x86 series for details; we shouldn't try reinventing the
>>>> wheel but find a core-mm approach that fits multiple architectures.
>>>>
>>>> https://lkml.kernel.org/r/20230119212317.8324-1-rick.p.edgecombe@intel.com
>>>
>>> Thanks David for the comment here. I looked at the x86 approach. This patch
>>> is actually written in a way that does not re-invent the wheel and follows
>>> a core-mm approach that fits multiple architectures.
>>>
>>> The change above checks `is_shadow_stack_vma`, and only if it returns true
>>> does it manufacture a shadow stack pte; otherwise it makes a regular
>>> writeable mapping.
>>>
>>> Now if we look at the `is_shadow_stack_vma` implementation, it returns false
>>> if `CONFIG_USER_SHADOW_STACK` is not defined. If `CONFIG_USER_SHADOW_STACK`
>>> is defined, it calls `arch_is_shadow_stack_vma`, which should be implemented
>>> by arch-specific code. This allows each architecture to define its own vma
>>> flag encodings for shadow stack (riscv chooses the presence of only
>>> `VM_WRITE`, which is analogous to the chosen PTE encodings on riscv:
>>> W=1,R=0,X=0).
>>>
>>> Additionally, pte_mkshdwstk will be a nop if not implemented by the
>>> architecture.
>>>
>>> Let me know if this makes sense, or if I am missing something here.
>>
>> See the discussion in that thread. The idea is to pass a VMA to
>> pte_mkwrite() and let it handle how to actually set it writable.
>>
>
> Thanks. I see. Instances where `pte_mkwrite` is directly invoked by checking
> VM_WRITE and thus instead of fixing all those instance, make pte_mkwrite itself
> take vma flag or vma.
>
> I'll revise.
Thanks, it would be great to discuss in the other threads what else you
would need to make it work for you. I assume Rick will have something to
play with soonish (Right, Rick? :) ).
--
Thanks,
David / dhildenb
* Re: [PATCH v1 RFC Zisslpcfi 11/20] mmu: maybe_mkwrite updated to manufacture shadow stack PTEs
2023-02-14 12:10 ` David Hildenbrand
@ 2023-02-14 18:27 ` Edgecombe, Rick P
0 siblings, 0 replies; 10+ messages in thread
From: Edgecombe, Rick P @ 2023-02-14 18:27 UTC (permalink / raw)
To: david@redhat.com, debug@rivosinc.com
Cc: linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org,
linux-mm@kvack.org, akpm@linux-foundation.org
On Tue, 2023-02-14 at 13:10 +0100, David Hildenbrand wrote:
> I assume Rick will have something to
> play with soonish (Right, Rick? :) ).
Yes, Deepak and I were discussing on the x86 series. I haven't heard
anything from 0-day for a few days so looking good. There was
discussion happening with Boris on the pte_modify() patch, so might
wait a day more to post a new version.
Thread overview: 10+ messages
[not found] <20230213045351.3945824-1-debug@rivosinc.com>
2023-02-13 4:53 ` [PATCH v1 RFC Zisslpcfi 05/20] mmap : Introducing new protection "PROT_SHADOWSTACK" for mmap Deepak Gupta
2023-02-13 4:53 ` [PATCH v1 RFC Zisslpcfi 07/20] elf: ELF header parsing in GNU property for cfi state Deepak Gupta
2023-02-13 4:53 ` [PATCH v1 RFC Zisslpcfi 08/20] riscv: ELF header parsing in GNU property for riscv zisslpcfi Deepak Gupta
2023-02-13 4:53 ` [PATCH v1 RFC Zisslpcfi 11/20] mmu: maybe_mkwrite updated to manufacture shadow stack PTEs Deepak Gupta
2023-02-13 12:05 ` David Hildenbrand
2023-02-13 14:37 ` Deepak Gupta
2023-02-13 14:56 ` David Hildenbrand
2023-02-13 20:01 ` Deepak Gupta
2023-02-14 12:10 ` David Hildenbrand
2023-02-14 18:27 ` Edgecombe, Rick P