linux-mm.kvack.org archive mirror
* [PATCH v6 00/33] riscv control-flow integrity for usermode
@ 2024-10-08 22:36 Deepak Gupta
  2024-10-08 22:36 ` [PATCH v6 01/33] mm: Introduce ARCH_HAS_USER_SHADOW_STACK Deepak Gupta
                   ` (33 more replies)
  0 siblings, 34 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:36 UTC
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta, David Hildenbrand, Carlos Bilbao, Samuel Holland,
	Andrew Jones, Conor Dooley, Andy Chiu

Basics and overview
===================

Software with larger attack surfaces (e.g. network facing apps like databases,
browsers or apps relying on browser runtimes) suffer from memory corruption
issues which can be utilized by attackers to bend control flow of the program
to eventually gain control (by making their payload executable). Attackers are
able to perform such attacks by leveraging call-sites which rely on indirect
calls or return sites which rely on obtaining return address from stack memory.

To mitigate such attacks, risc-v extension zicfilp enforces that all indirect
calls must land on a landing pad instruction `lpad`; otherwise the cpu raises a
software check exception (a new cpu exception cause code on riscv).
Similarly, for return flow, risc-v extension zicfiss extends the architecture with

- `sspush` instruction to push return address on a shadow stack
- `sspopchk` instruction to pop return address from shadow stack
  and compare with input operand (i.e. return address on stack)
- `sspopchk` to raise software check exception if comparison above
  was a mismatch
- Protection mechanism using which shadow stack is not writeable via
  regular store instructions
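
The return-flow check listed above can be modelled in plain C. This is only a
toy software model, not how the hardware or this series implements it; the
names `shstk_model`, `model_sspush` and `model_sspopchk` are made up for
illustration, and the software check exception is represented by a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SHSTK_DEPTH 64	/* arbitrary depth chosen for this model */

/* Toy software model of a shadow stack; all names are hypothetical. */
struct shstk_model {
	uintptr_t slots[SHSTK_DEPTH];
	size_t top;		/* number of entries currently pushed */
	bool sw_check_raised;	/* stands in for the software check exception */
};

/* Models `sspush`: record the return address on function entry. */
static void model_sspush(struct shstk_model *ss, uintptr_t ra)
{
	assert(ss->top < SHSTK_DEPTH);
	ss->slots[ss->top++] = ra;
}

/*
 * Models `sspopchk`: pop the saved address and compare it with the return
 * address read from the regular stack. Returns true on a match; on a
 * mismatch it records the fault that hardware would raise.
 */
static bool model_sspopchk(struct shstk_model *ss, uintptr_t ra_from_stack)
{
	assert(ss->top > 0);
	if (ss->slots[--ss->top] != ra_from_stack) {
		ss->sw_check_raised = true;
		return false;
	}
	return true;
}
```

A return address overwritten on the regular stack no longer matches the copy
held in the model, so `model_sspopchk` reports the mismatch instead of
letting the return proceed.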

More information and details can be found at extensions github repo [1].

Equivalent to landing pad (zicfilp) on x86 is `ENDBRANCH` instruction in Intel
CET [3] and branch target identification (BTI) [4] on arm.
Similarly x86's Intel CET has shadow stack [5] and arm64 has guarded control
stack (GCS) [6] which are very similar to risc-v's zicfiss shadow stack.

x86 already supports shadow stack for user mode and arm64 support for GCS in
usermode [7] is ongoing.

Kernel awareness for user control flow integrity
================================================

This series picks up Samuel Holland's envcfg changes [2] as well. So if those are
being applied independently, they should be removed from this series.

Enabling:

In order to maintain compatibility and not break anything in user mode, kernel
doesn't enable control flow integrity cpu extensions on binary by default.
Instead, it exposes a prctl interface to enable, disable and lock the shadow
stack or landing pad feature for a task. This allows userspace (loader) to
check whether all objects in its address space are compiled with shadow stack
and landing pad support and enable the feature accordingly. Additionally, if a
subsequent `dlopen` happens on a library, user mode can decide again to either
disable the feature (if incoming library is not compiled with support) or
terminate the task (if user mode policy strictly requires all objects in the
address space to be compiled with control flow integrity cpu feature). The
prctl to enable shadow stack allocates a shadow stack from virtual memory and
activates it for the user address space. x86 and arm64 follow the same
direction for similar reason(s).
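
The loader-side decision described above can be sketched as a small policy
function. This is an illustration only: `loaded_object`, `cfi_policy` and
`cfi_decide` are hypothetical names, how a loader learns whether an object
was built with cfi support is not modelled, and the actual prctl calls are
left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user mode policy and resulting action. */
enum cfi_policy { CFI_POLICY_PERMISSIVE, CFI_POLICY_STRICT };
enum cfi_action { CFI_ENABLE, CFI_DISABLE, CFI_TERMINATE };

/* One object in the address space; has_cfi means it was built with
 * shadow stack and landing pad support. */
struct loaded_object {
	const char *name;
	bool has_cfi;
};

/*
 * Decide what to do after the initial load or after a later dlopen():
 * enable only if every object supports cfi, otherwise disable or
 * terminate depending on the policy.
 */
static enum cfi_action cfi_decide(const struct loaded_object *objs, size_t n,
				  enum cfi_policy policy)
{
	assert(objs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!objs[i].has_cfi)
			return policy == CFI_POLICY_STRICT ?
				CFI_TERMINATE : CFI_DISABLE;
	}
	return CFI_ENABLE;
}
```

A loader following this shape would issue the enable (and possibly lock)
prctl only when the result is `CFI_ENABLE`.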

clone/fork:

On clone and fork, cfi state for task is inherited by child. Shadow stack is
part of virtual memory and is a writeable memory from kernel perspective
(writeable via a restricted set of instructions aka shadow stack instructions).
Thus kernel changes ensure that this memory is converted into read-only when
fork/clone happens and COWed when fault is taken due to sspush, sspopchk or
ssamoswap. In case `CLONE_VM` is specified and shadow stack is to be enabled,
kernel will automatically allocate a shadow stack for that clone call.
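
As a rough illustration of the cases above (not the code in this series), the
choice made on clone/fork can be written as a pure function. The enum and
function names are hypothetical; `MODEL_CLONE_VM` mirrors the value of
`CLONE_VM` only so that the sketch stays standalone.

```c
#include <assert.h>
#include <stdbool.h>

/* Same value as CLONE_VM; redefined here to keep the sketch standalone. */
#define MODEL_CLONE_VM 0x00000100UL

enum shstk_clone_action {
	SHSTK_NOTHING,		/* shadow stack is not enabled for the task */
	SHSTK_COW_FROM_PARENT,	/* separate address space: pages become
				 * read-only and are copied on fault */
	SHSTK_ALLOC_NEW,	/* shared address space: child gets a newly
				 * allocated shadow stack */
};

static enum shstk_clone_action shstk_on_clone(unsigned long clone_flags,
					      bool shstk_enabled)
{
	if (!shstk_enabled)
		return SHSTK_NOTHING;
	if (clone_flags & MODEL_CLONE_VM)
		return SHSTK_ALLOC_NEW;
	return SHSTK_COW_FROM_PARENT;
}
```

For example, `assert(shstk_on_clone(MODEL_CLONE_VM, true) == SHSTK_ALLOC_NEW)`
holds for a thread-style clone, while a plain fork (no `CLONE_VM`) falls into
the copy-on-write case.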

map_shadow_stack:

x86 introduced `map_shadow_stack` system call to allow user space to explicitly
map shadow stack memory in its address space. It is useful for allocating shadow
stacks for different contexts managed by a single thread (green threads or
contexts). risc-v implements this system call as well.
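
A user space runtime could request such a mapping roughly as follows. This is
a sketch under assumptions: the fallback syscall number 453 and the
`SHADOW_STACK_SET_TOKEN` flag are taken from the existing x86/generic uapi
and are only used when the installed headers lack them, and
`alloc_shadow_stack` is a made-up helper name. On kernels or CPUs without
shadow stack support the call simply fails and the helper returns NULL.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __NR_map_shadow_stack
#define __NR_map_shadow_stack 453	/* assumed: generic syscall table value */
#endif
#ifndef SHADOW_STACK_SET_TOKEN
#define SHADOW_STACK_SET_TOKEN (1ULL << 0)	/* assumed: x86 uapi value */
#endif

/*
 * Ask the kernel for a shadow stack of `size` bytes at an address of its
 * choosing. Returns NULL (with errno set by syscall()) if the kernel or
 * the CPU does not support shadow stacks.
 */
static void *alloc_shadow_stack(size_t size)
{
	long ret;

	assert(size > 0);
	ret = syscall(__NR_map_shadow_stack, 0UL, (unsigned long)size,
		      (unsigned long)SHADOW_STACK_SET_TOKEN);
	return ret == -1 ? NULL : (void *)ret;
}
```

A green-thread runtime would call `alloc_shadow_stack()` once per context and
release the mapping with `munmap()` when the context is destroyed.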

signal management:

If shadow stack is enabled for a task, kernel performs an asynchronous control
flow diversion to deliver the signal and eventually expects userspace to issue
sigreturn so that original execution can be resumed. Even though the resume
context is prepared by kernel, it lives in user space memory and is subject to
memory corruption. An attacker can exploit corruption bugs in this window to
perform an arbitrary sigreturn and eventually bypass the cfi mechanism.
Another issue is how to ensure that cfi related state on sigcontext area is not
trampled by legacy apps or apps compiled with old kernel headers.

In order to mitigate control-flow hijacking, kernel prepares a token, places
it on shadow stack before signal delivery and places address of token in
sigcontext structure. During sigreturn, kernel obtains address of token from
sigcontext structure, reads token from shadow stack, validates it and only
then allows sigreturn to succeed. Compatibility issue is solved by adopting
dynamic sigcontext management introduced for vector extension. This series
re-factors the code a little bit to make future sigcontext management easier
(as proposed by Andy Chiu from SiFive).
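
The token handshake can be modelled in a few lines of C. This is a toy model
under stated assumptions: the shadow stack is an array indexed by slot, the
token encoding (`TOKEN_TAG` combined with the slot index) is invented for the
example, and `model_deliver_signal` / `model_sigreturn` are hypothetical
names rather than functions from this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SHSTK_SLOTS 16
#define TOKEN_TAG   ((uintptr_t)0x5a5a0000)	/* invented encoding */

struct task_model {
	uintptr_t shstk[SHSTK_SLOTS];	/* grows down; ssp == SHSTK_SLOTS is empty */
	size_t ssp;
};

struct sigcontext_model {
	size_t token_slot;	/* user-writable copy of where the token lives */
};

static uintptr_t make_token(size_t slot)
{
	return TOKEN_TAG | (uintptr_t)slot;
}

/* Signal delivery: push a token and record its location in sigcontext. */
static bool model_deliver_signal(struct task_model *t, struct sigcontext_model *sc)
{
	assert(t != NULL && sc != NULL);
	if (t->ssp == 0)
		return false;	/* no room left on the shadow stack */
	t->ssp--;
	t->shstk[t->ssp] = make_token(t->ssp);
	sc->token_slot = t->ssp;
	return true;
}

/* sigreturn: succeed only if a valid token sits where sigcontext claims. */
static bool model_sigreturn(struct task_model *t, const struct sigcontext_model *sc)
{
	size_t slot = sc->token_slot;

	if (slot >= SHSTK_SLOTS || t->shstk[slot] != make_token(slot))
		return false;
	t->shstk[slot] = 0;	/* consume the token so it cannot be replayed */
	t->ssp = slot + 1;
	return true;
}
```

Because the token lives on the shadow stack, corrupting the sigcontext copy
alone is not enough: a forged or replayed `token_slot` points at a slot that
does not hold a matching token, and the model refuses the sigreturn.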

config and compilation:

Introduce a new risc-v config option `CONFIG_RISCV_USER_CFI`. Selecting this
config option picks the kernel support for user control flow integrity. This
option is presented only if toolchain has shadow stack and landing pad support.
It is guarded by toolchain support on purpose, because eventually vDSO also
needs to be compiled with shadow stack and landing pad support.
vDSO compile patches are not included as of now because landing pad labeling
scheme is yet to settle for usermode runtime.

To get more information on kernel interactions with respect to
zicfilp and zicfiss, patch series adds documentation for
`zicfilp` and `zicfiss` in following:
Documentation/arch/riscv/zicfiss.rst
Documentation/arch/riscv/zicfilp.rst

How to test this series
=======================

Toolchain
---------
$ git clone git@github.com:sifive/riscv-gnu-toolchain.git -b cfi-dev
$ riscv-gnu-toolchain/configure --prefix=<path-to-where-to-build> --with-arch=rv64gc_zicfilp_zicfiss --enable-linux --disable-gdb  --with-extra-multilib-test="rv64gc_zicfilp_zicfiss-lp64d:-static"
$ make -j$(nproc)

Qemu
----
$ git clone git@github.com:deepak0414/qemu.git -b zicfilp_zicfiss_ratified_master_july11
$ cd qemu
$ mkdir build
$ cd build
$ ../configure --target-list=riscv64-softmmu
$ make -j$(nproc)

Opensbi
-------
$ git clone git@github.com:deepak0414/opensbi.git -b v6_cfi_spec_split_opensbi
$ make CROSS_COMPILE=<your riscv toolchain> -j$(nproc) PLATFORM=generic

Linux
-----
Running defconfig is fine. CFI is enabled by default if the toolchain
supports it.

$ make ARCH=riscv CROSS_COMPILE=<path-to-cfi-riscv-gnu-toolchain>/build/bin/riscv64-unknown-linux-gnu- -j$(nproc) defconfig
$ make ARCH=riscv CROSS_COMPILE=<path-to-cfi-riscv-gnu-toolchain>/build/bin/riscv64-unknown-linux-gnu- -j$(nproc)

Branch where user cfi enabling patches are maintained
https://github.com/deepak0414/linux-riscv-cfi/tree/vdso_user_cfi_v6.12-rc1

In case you're building your own rootfs using toolchain, please make sure you
pick following patch to ensure that vDSO is compiled with lpad and shadow stack.

"arch/riscv: compile vdso with landing pad"

Running
-------

Modify your qemu command to have:
-bios <path-to-cfi-opensbi>/build/platform/generic/firmware/fw_dynamic.bin
-cpu rv64,zicfilp=true,zicfiss=true,zimop=true,zcmop=true

vDSO related Opens (in flux)
============================

I am listing these opens to lay out the plan and what to expect in future
patch sets. And of course for the sake of discussion.

Shadow stack and landing pad enabling in vDSO
----------------------------------------------
vDSO must have shadow stack and landing pad support compiled in for task
to have shadow stack and landing pad support. This patch series doesn't
enable that (yet). Enabling shadow stack support in vDSO should be
straightforward (I intend to do that in next versions of patch set). Enabling
landing pad support in vDSO requires some collaboration with toolchain folks
to follow a single label scheme for all object binaries. This is necessary to
ensure that all indirect call-sites are setting correct label and target landing
pads are decorated with same label scheme.

How many vDSOs
---------------
Shadow stack instructions are carved out of zimop (may-be-operations) and if CPU
doesn't implement zimop, they're illegal instructions. Kernel could be running on
a CPU which may or may not implement zimop. And thus kernel will have to carry 2
different vDSOs and expose the appropriate one depending on whether CPU implements
zimop or not.
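
In its simplest form the selection described above is a single branch. The
sketch below is hypothetical (the enum and `pick_vdso` are not part of this
series) and only illustrates that the choice depends on zimop alone.

```c
#include <assert.h>
#include <stdbool.h>

/* Two vDSO flavours the kernel would have to carry (hypothetical names). */
enum vdso_image {
	VDSO_NO_SHSTK,		/* built without shadow stack instructions */
	VDSO_WITH_SHSTK,	/* built with shadow stack instructions */
};

/*
 * Shadow stack instructions are illegal instructions on a CPU without
 * zimop, so the shadow-stack-enabled image can only be exposed when the
 * CPU implements zimop.
 */
static enum vdso_image pick_vdso(bool cpu_has_zimop)
{
	return cpu_has_zimop ? VDSO_WITH_SHSTK : VDSO_NO_SHSTK;
}
```

For example, `assert(pick_vdso(false) == VDSO_NO_SHSTK)` holds on a CPU
without zimop.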

References
==========
[1] - https://github.com/riscv/riscv-cfi
[2] - https://lore.kernel.org/all/20240814081126.956287-1-samuel.holland@sifive.com/
[3] - https://lwn.net/Articles/889475/
[4] - https://developer.arm.com/documentation/109576/0100/Branch-Target-Identification
[5] - https://www.intel.com/content/dam/develop/external/us/en/documents/catc17-introduction-intel-cet-844137.pdf
[6] - https://lwn.net/Articles/940403/
[7] - https://lore.kernel.org/all/20241001-arm64-gcs-v13-0-222b78d87eee@kernel.org/

To: Thomas Gleixner <tglx@linutronix.de>
To: Ingo Molnar <mingo@redhat.com>
To: Borislav Petkov <bp@alien8.de>
To: Dave Hansen <dave.hansen@linux.intel.com>
To: x86@kernel.org
To: H. Peter Anvin <hpa@zytor.com>
To: Andrew Morton <akpm@linux-foundation.org>
To: Liam R. Howlett <Liam.Howlett@oracle.com>
To: Vlastimil Babka <vbabka@suse.cz>
To: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Paul Walmsley <paul.walmsley@sifive.com>
To: Palmer Dabbelt <palmer@dabbelt.com>
To: Albert Ou <aou@eecs.berkeley.edu>
To: Conor Dooley <conor@kernel.org>
To: Rob Herring <robh@kernel.org>
To: Krzysztof Kozlowski <krzk+dt@kernel.org>
To: Arnd Bergmann <arnd@arndb.de>
To: Christian Brauner <brauner@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
To: Oleg Nesterov <oleg@redhat.com>
To: Eric Biederman <ebiederm@xmission.com>
To: Kees Cook <kees@kernel.org>
To: Jonathan Corbet <corbet@lwn.net>
To: Shuah Khan <shuah@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: linux-riscv@lists.infradead.org
Cc: devicetree@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Cc: alistair.francis@wdc.com
Cc: richard.henderson@linaro.org
Cc: jim.shu@sifive.com
Cc: andybnac@gmail.com
Cc: kito.cheng@sifive.com
Cc: charlie@rivosinc.com
Cc: atishp@rivosinc.com
Cc: evan@rivosinc.com
Cc: cleger@rivosinc.com
Cc: alexghiti@rivosinc.com
Cc: samitolvanen@google.com
Cc: broonie@kernel.org
Cc: rick.p.edgecombe@intel.com

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
changelog
---------

v6:
- Picked up Samuel Holland's changes as is with `envcfg` placed in
  `thread` instead of `thread_info`
- fixed unaligned newline escapes in kselftest
- cleaned up messages in kselftest and included test output in commit message
- fixed a bug in clone path reported by Zong Li
- fixed a build issue if CONFIG_RISCV_ISA_V is not selected
  (this was introduced due to re-factoring signal context
  management code)

v5:
- rebased on v6.12-rc1
- Fixed schema related issues in device tree file
- Fixed some of the documentation related issues in zicfilp/ss.rst
  (style issues and added index)
- added `SHADOW_STACK_SET_MARKER` so that implementation can define base
  of shadow stack.
- Fixed warnings on definitions added in usercfi.h when
  CONFIG_RISCV_USER_CFI is not selected.
- Adopted context header based signal handling as proposed by Andy Chiu
- Added support for enabling kernel mode access to shadow stack using
  FWFT
  (https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/src/ext-firmware-features.adoc)
- Link to v5: https://lore.kernel.org/r/20241001-v5_user_cfi_series-v1-0-3ba65b6e550f@rivosinc.com
  (Note: I had an issue in my workflow due to which version number wasn't
  picked up correctly while sending out patches)

v4:
- rebased on 6.11-rc6
- envcfg: Converged with Samuel Holland's patches for envcfg management on
  per-thread basis.
- vma_is_shadow_stack is renamed to is_vma_shadow_stack
- picked up Mark Brown's `ARCH_HAS_USER_SHADOW_STACK` patch
- signal context: using extended context management to maintain compatibility.
- fixed `-Wmissing-prototypes` compiler warnings for prctl functions
- Documentation fixes and amending typos.
- Link to v4: https://lore.kernel.org/all/20240912231650.3740732-1-debug@rivosinc.com/

v3:
- envcfg
  logic to pick up base envcfg had a bug where `ENVCFG_CBZE` could have been
  picked on per task basis, even though CPU didn't implement it. Fixed in
   this series.

- dt-bindings
  As suggested, split into separate commit. fixed the messaging that spec is
  in public review

- arch_is_shadow_stack change
  arch_is_shadow_stack changed to vma_is_shadow_stack

- hwprobe
  zicfiss / zicfilp if present will get enumerated in hwprobe

- selftests
  As suggested, added object and binary filenames to .gitignore
  Selftest binary anyways need to be compiled with cfi enabled compiler which
  will make sure that landing pad and shadow stack are enabled. Thus removed
  separate enable/disable tests. Cleaned up tests a bit.

- Link to v3: https://lore.kernel.org/lkml/20240403234054.2020347-1-debug@rivosinc.com/

v2:
- Using config `CONFIG_RISCV_USER_CFI`, kernel support for riscv control flow
  integrity for user mode programs can be compiled in the kernel.

- Enabling of control flow integrity for user programs is left to user runtime

- This patch series introduces arch agnostic `prctls` to enable shadow stack
  and indirect branch tracking. And implements them on riscv.

---
Andy Chiu (1):
      riscv: signal: abstract header saving for setup_sigcontext

Clément Léger (1):
      riscv: Add Firmware Feature SBI extensions definitions

Deepak Gupta (26):
      mm: helper `is_shadow_stack_vma` to check shadow stack vma
      riscv/Kconfig: enable HAVE_EXIT_THREAD for riscv
      dt-bindings: riscv: zicfilp and zicfiss in dt-bindings (extensions.yaml)
      riscv: zicfiss / zicfilp enumeration
      riscv: zicfiss / zicfilp extension csr and bit definitions
      riscv: usercfi state for task and save/restore of CSR_SSP on trap entry/exit
      riscv/mm : ensure PROT_WRITE leads to VM_READ | VM_WRITE
      riscv mm: manufacture shadow stack pte
      riscv mmu: teach pte_mkwrite to manufacture shadow stack PTEs
      riscv mmu: write protect and shadow stack
      riscv/mm: Implement map_shadow_stack() syscall
      riscv/shstk: If needed allocate a new shadow stack on clone
      prctl: arch-agnostic prctl for indirect branch tracking
      riscv: Implements arch agnostic shadow stack prctls
      riscv: Implements arch agnostic indirect branch tracking prctls
      riscv/traps: Introduce software check exception
      riscv/signal: save and restore of shadow stack for signal
      riscv/kernel: update __show_regs to print shadow stack register
      riscv/ptrace: riscv cfi status and state via ptrace and in core files
      riscv/hwprobe: zicfilp / zicfiss enumeration in hwprobe
      riscv: enable kernel access to shadow stack memory via FWFT sbi call
      riscv: kernel command line option to opt out of user cfi
      riscv: create a config for shadow stack and landing pad instr support
      riscv: Documentation for landing pad / indirect branch tracking
      riscv: Documentation for shadow stack on riscv
      kselftest/riscv: kselftest for user mode cfi

Mark Brown (2):
      mm: Introduce ARCH_HAS_USER_SHADOW_STACK
      prctl: arch-agnostic prctl for shadow stack

Samuel Holland (3):
      riscv: Enable cbo.zero only when all harts support Zicboz
      riscv: Add support for per-thread envcfg CSR values
      riscv: Call riscv_user_isa_enable() only on the boot hart

 Documentation/arch/riscv/index.rst                 |   2 +
 Documentation/arch/riscv/zicfilp.rst               | 115 +++++
 Documentation/arch/riscv/zicfiss.rst               | 176 +++++++
 .../devicetree/bindings/riscv/extensions.yaml      |  14 +
 arch/riscv/Kconfig                                 |  21 +
 arch/riscv/include/asm/asm-prototypes.h            |   1 +
 arch/riscv/include/asm/cpufeature.h                |  15 +-
 arch/riscv/include/asm/csr.h                       |  16 +
 arch/riscv/include/asm/entry-common.h              |   2 +
 arch/riscv/include/asm/hwcap.h                     |   2 +
 arch/riscv/include/asm/mman.h                      |  24 +
 arch/riscv/include/asm/pgtable.h                   |  30 +-
 arch/riscv/include/asm/processor.h                 |   3 +
 arch/riscv/include/asm/sbi.h                       |  27 ++
 arch/riscv/include/asm/switch_to.h                 |   8 +
 arch/riscv/include/asm/thread_info.h               |   3 +
 arch/riscv/include/asm/usercfi.h                   |  89 ++++
 arch/riscv/include/asm/vector.h                    |   3 +
 arch/riscv/include/uapi/asm/hwprobe.h              |   2 +
 arch/riscv/include/uapi/asm/ptrace.h               |  22 +
 arch/riscv/include/uapi/asm/sigcontext.h           |   1 +
 arch/riscv/kernel/Makefile                         |   2 +
 arch/riscv/kernel/asm-offsets.c                    |   8 +
 arch/riscv/kernel/cpufeature.c                     |  13 +-
 arch/riscv/kernel/entry.S                          |  31 +-
 arch/riscv/kernel/head.S                           |  12 +
 arch/riscv/kernel/process.c                        |  31 +-
 arch/riscv/kernel/ptrace.c                         |  83 ++++
 arch/riscv/kernel/signal.c                         | 140 +++++-
 arch/riscv/kernel/smpboot.c                        |   2 -
 arch/riscv/kernel/suspend.c                        |   4 +-
 arch/riscv/kernel/sys_hwprobe.c                    |   2 +
 arch/riscv/kernel/sys_riscv.c                      |  10 +
 arch/riscv/kernel/traps.c                          |  42 ++
 arch/riscv/kernel/usercfi.c                        | 526 +++++++++++++++++++++
 arch/riscv/mm/init.c                               |   2 +-
 arch/riscv/mm/pgtable.c                            |  17 +
 arch/x86/Kconfig                                   |   1 +
 fs/proc/task_mmu.c                                 |   2 +-
 include/linux/cpu.h                                |   4 +
 include/linux/mm.h                                 |   5 +-
 include/uapi/asm-generic/mman.h                    |   4 +
 include/uapi/linux/elf.h                           |   1 +
 include/uapi/linux/prctl.h                         |  48 ++
 kernel/sys.c                                       |  60 +++
 mm/Kconfig                                         |   6 +
 mm/gup.c                                           |   2 +-
 mm/mmap.c                                          |   1 +
 mm/vma.h                                           |  10 +-
 tools/testing/selftests/riscv/Makefile             |   2 +-
 tools/testing/selftests/riscv/cfi/.gitignore       |   3 +
 tools/testing/selftests/riscv/cfi/Makefile         |  10 +
 tools/testing/selftests/riscv/cfi/cfi_rv_test.h    |  84 ++++
 tools/testing/selftests/riscv/cfi/riscv_cfi_test.c |  78 +++
 tools/testing/selftests/riscv/cfi/shadowstack.c    | 373 +++++++++++++++
 tools/testing/selftests/riscv/cfi/shadowstack.h    |  37 ++
 56 files changed, 2190 insertions(+), 42 deletions(-)
---
base-commit: 7d9923ee3960bdbfaa7f3a4e0ac2364e770c46ff
change-id: 20240930-v5_user_cfi_series-3dc332f8f5b2
--
- debug




* [PATCH v6 01/33] mm: Introduce ARCH_HAS_USER_SHADOW_STACK
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
@ 2024-10-08 22:36 ` Deepak Gupta
  2024-10-08 22:36 ` [PATCH v6 02/33] mm: helper `is_shadow_stack_vma` to check shadow stack vma Deepak Gupta
                   ` (32 subsequent siblings)
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:36 UTC
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta, David Hildenbrand, Carlos Bilbao

From: Mark Brown <broonie@kernel.org>

Since multiple architectures have support for shadow stacks and we need to
select support for this feature in several places in the generic code
provide a generic config option that the architectures can select.

Suggested-by: David Hildenbrand <david@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Reviewed-by: Carlos Bilbao <carlos.bilbao.osdev@gmail.com>
---
 arch/x86/Kconfig   | 1 +
 fs/proc/task_mmu.c | 2 +-
 include/linux/mm.h | 2 +-
 mm/Kconfig         | 6 ++++++
 4 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 2852fcd82cbd..8ccae77d40f7 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1954,6 +1954,7 @@ config X86_USER_SHADOW_STACK
 	depends on AS_WRUSS
 	depends on X86_64
 	select ARCH_USES_HIGH_VMA_FLAGS
+	select ARCH_HAS_USER_SHADOW_STACK
 	select X86_CET
 	help
 	  Shadow stack protection is a hardware feature that detects function
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 72f14fd59c2d..23f875e78eae 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -971,7 +971,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
 #ifdef CONFIG_HAVE_ARCH_USERFAULTFD_MINOR
 		[ilog2(VM_UFFD_MINOR)]	= "ui",
 #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_MINOR */
-#ifdef CONFIG_X86_USER_SHADOW_STACK
+#ifdef CONFIG_ARCH_HAS_USER_SHADOW_STACK
 		[ilog2(VM_SHADOW_STACK)] = "ss",
 #endif
 #if defined(CONFIG_64BIT) || defined(CONFIG_PPC32)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ecf63d2b0582..57533b9cae95 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -354,7 +354,7 @@ extern unsigned int kobjsize(const void *objp);
 #endif
 #endif /* CONFIG_ARCH_HAS_PKEYS */
 
-#ifdef CONFIG_X86_USER_SHADOW_STACK
+#ifdef CONFIG_ARCH_HAS_USER_SHADOW_STACK
 /*
  * VM_SHADOW_STACK should not be set with VM_SHARED because of lack of
  * support core mm.
diff --git a/mm/Kconfig b/mm/Kconfig
index 4c9f5ea13271..4b2a1ef9a161 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1296,6 +1296,12 @@ config NUMA_EMU
 	  into virtual nodes when booted with "numa=fake=N", where N is the
 	  number of nodes. This is only useful for debugging.
 
+config ARCH_HAS_USER_SHADOW_STACK
+	bool
+	help
+	  The architecture has hardware support for userspace shadow call
+          stacks (eg, x86 CET, arm64 GCS or RISC-V Zicfiss).
+
 source "mm/damon/Kconfig"
 
 endmenu

-- 
2.45.0




* [PATCH v6 02/33] mm: helper `is_shadow_stack_vma` to check shadow stack vma
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
  2024-10-08 22:36 ` [PATCH v6 01/33] mm: Introduce ARCH_HAS_USER_SHADOW_STACK Deepak Gupta
@ 2024-10-08 22:36 ` Deepak Gupta
  2024-10-09 11:11   ` Mark Brown
  2024-10-08 22:36 ` [PATCH v6 03/33] riscv: Enable cbo.zero only when all harts support Zicboz Deepak Gupta
                   ` (31 subsequent siblings)
  33 siblings, 1 reply; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:36 UTC
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta

VM_SHADOW_STACK (alias to VM_HIGH_ARCH_5) is used to encode shadow stack
VMA on three architectures (x86 shadow stack, arm GCS and RISC-V shadow
stack). In case architecture doesn't implement shadow stack, it's VM_NONE.
Introduce a helper `is_shadow_stack_vma` to determine whether a VMA is a
shadow stack VMA or not.

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
 mm/gup.c |  2 +-
 mm/vma.h | 10 +++++++---
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index a82890b46a36..8e6e14179f6c 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1282,7 +1282,7 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
 		    !writable_file_mapping_allowed(vma, gup_flags))
 			return -EFAULT;
 
-		if (!(vm_flags & VM_WRITE) || (vm_flags & VM_SHADOW_STACK)) {
+		if (!(vm_flags & VM_WRITE) || is_shadow_stack_vma(vm_flags)) {
 			if (!(gup_flags & FOLL_FORCE))
 				return -EFAULT;
 			/* hugetlb does not support FOLL_FORCE|FOLL_WRITE. */
diff --git a/mm/vma.h b/mm/vma.h
index 819f994cf727..0f238dc37231 100644
--- a/mm/vma.h
+++ b/mm/vma.h
@@ -357,7 +357,7 @@ static inline struct vm_area_struct *vma_prev_limit(struct vma_iterator *vmi,
 }
 
 /*
- * These three helpers classifies VMAs for virtual memory accounting.
+ * These four helpers classifies VMAs for virtual memory accounting.
  */
 
 /*
@@ -368,6 +368,11 @@ static inline bool is_exec_mapping(vm_flags_t flags)
 	return (flags & (VM_EXEC | VM_WRITE | VM_STACK)) == VM_EXEC;
 }
 
+static inline bool is_shadow_stack_vma(vm_flags_t vm_flags)
+{
+	return !!(vm_flags & VM_SHADOW_STACK);
+}
+
 /*
  * Stack area (including shadow stacks)
  *
@@ -376,7 +381,7 @@ static inline bool is_exec_mapping(vm_flags_t flags)
  */
 static inline bool is_stack_mapping(vm_flags_t flags)
 {
-	return ((flags & VM_STACK) == VM_STACK) || (flags & VM_SHADOW_STACK);
+	return ((flags & VM_STACK) == VM_STACK) || is_shadow_stack_vma(flags);
 }
 
 /*
@@ -387,7 +392,6 @@ static inline bool is_data_mapping(vm_flags_t flags)
 	return (flags & (VM_WRITE | VM_SHARED | VM_STACK)) == VM_WRITE;
 }
 
-
 static inline void vma_iter_config(struct vma_iterator *vmi,
 		unsigned long index, unsigned long last)
 {

-- 
2.45.0




* [PATCH v6 03/33] riscv: Enable cbo.zero only when all harts support Zicboz
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
  2024-10-08 22:36 ` [PATCH v6 01/33] mm: Introduce ARCH_HAS_USER_SHADOW_STACK Deepak Gupta
  2024-10-08 22:36 ` [PATCH v6 02/33] mm: helper `is_shadow_stack_vma` to check shadow stack vma Deepak Gupta
@ 2024-10-08 22:36 ` Deepak Gupta
  2024-10-08 22:36 ` [PATCH v6 04/33] riscv: Add support for per-thread envcfg CSR values Deepak Gupta
                   ` (30 subsequent siblings)
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:36 UTC
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta, Samuel Holland, Andrew Jones, Conor Dooley

From: Samuel Holland <samuel.holland@sifive.com>

Currently, we enable cbo.zero for usermode on each hart that supports
the Zicboz extension. This means that the [ms]envcfg CSR value may
differ between harts. Other features, such as pointer masking and CFI,
require setting [ms]envcfg bits on a per-thread basis. The combination
of these two adds quite some complexity and overhead to context
switching, as we would need to maintain two separate masks for the
per-hart and per-thread bits. Andrew Jones, who originally added Zicboz
support, writes[1][2]:

  I've approached Zicboz the same way I would approach all
  extensions, which is to be per-hart. I'm not currently aware of
  a platform that is / will be composed of harts where some have
  Zicboz and others don't, but there's nothing stopping a platform
  like that from being built.

  So, how about we add code that confirms Zicboz is on all harts.
  If any hart does not have it, then we complain loudly and disable
  it on all the other harts. If it was just a hardware description
  bug, then it'll get fixed. If there's actually a platform which
  doesn't have Zicboz on all harts, then, when the issue is reported,
  we can decide to not support it, support it with defconfig, or
  support it under a Kconfig guard which must be enabled by the user.

Let's follow his suggested solution and require the extension to be
available on all harts, so the envcfg CSR value does not need to change
when a thread migrates between harts. Since we are doing this for all
extensions with fields in envcfg, the CSR itself only needs to be saved/
restored when it is present on all harts.

This should not be a regression as no known hardware has asymmetric
Zicboz support, but if anyone reports seeing the warning, we will
re-evaluate our solution.

Link: https://lore.kernel.org/linux-riscv/20240322-168f191eeb8479b2ea169a5e@orel/ [1]
Link: https://lore.kernel.org/linux-riscv/20240323-28943722feb57a41fb0ff488@orel/ [2]
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
---
 arch/riscv/kernel/cpufeature.c | 7 ++++++-
 arch/riscv/kernel/suspend.c    | 4 ++--
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 3a8eeaa9310c..e560a253e99b 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -28,6 +28,8 @@
 
 #define NUM_ALPHA_EXTS ('z' - 'a' + 1)
 
+static bool any_cpu_has_zicboz;
+
 unsigned long elf_hwcap __read_mostly;
 
 /* Host ISA bitmap */
@@ -98,6 +100,7 @@ static int riscv_ext_zicboz_validate(const struct riscv_isa_ext_data *data,
 		pr_err("Zicboz disabled as cboz-block-size present, but is not a power-of-2\n");
 		return -EINVAL;
 	}
+	any_cpu_has_zicboz = true;
 	return 0;
 }
 
@@ -919,8 +922,10 @@ unsigned long riscv_get_elf_hwcap(void)
 
 void riscv_user_isa_enable(void)
 {
-	if (riscv_cpu_has_extension_unlikely(smp_processor_id(), RISCV_ISA_EXT_ZICBOZ))
+	if (riscv_has_extension_unlikely(RISCV_ISA_EXT_ZICBOZ))
 		csr_set(CSR_ENVCFG, ENVCFG_CBZE);
+	else if (any_cpu_has_zicboz)
+		pr_warn_once("Zicboz disabled as it is unavailable on some harts\n");
 }
 
 #ifdef CONFIG_RISCV_ALTERNATIVE
diff --git a/arch/riscv/kernel/suspend.c b/arch/riscv/kernel/suspend.c
index c8cec0cc5833..9a8a0dc035b2 100644
--- a/arch/riscv/kernel/suspend.c
+++ b/arch/riscv/kernel/suspend.c
@@ -14,7 +14,7 @@
 
 void suspend_save_csrs(struct suspend_context *context)
 {
-	if (riscv_cpu_has_extension_unlikely(smp_processor_id(), RISCV_ISA_EXT_XLINUXENVCFG))
+	if (riscv_has_extension_unlikely(RISCV_ISA_EXT_XLINUXENVCFG))
 		context->envcfg = csr_read(CSR_ENVCFG);
 	context->tvec = csr_read(CSR_TVEC);
 	context->ie = csr_read(CSR_IE);
@@ -37,7 +37,7 @@ void suspend_save_csrs(struct suspend_context *context)
 void suspend_restore_csrs(struct suspend_context *context)
 {
 	csr_write(CSR_SCRATCH, 0);
-	if (riscv_cpu_has_extension_unlikely(smp_processor_id(), RISCV_ISA_EXT_XLINUXENVCFG))
+	if (riscv_has_extension_unlikely(RISCV_ISA_EXT_XLINUXENVCFG))
 		csr_write(CSR_ENVCFG, context->envcfg);
 	csr_write(CSR_TVEC, context->tvec);
 	csr_write(CSR_IE, context->ie);

-- 
2.45.0




* [PATCH v6 04/33] riscv: Add support for per-thread envcfg CSR values
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (2 preceding siblings ...)
  2024-10-08 22:36 ` [PATCH v6 03/33] riscv: Enable cbo.zero only when all harts support Zicboz Deepak Gupta
@ 2024-10-08 22:36 ` Deepak Gupta
  2024-10-08 22:36 ` [PATCH v6 05/33] riscv: Call riscv_user_isa_enable() only on the boot hart Deepak Gupta
                   ` (29 subsequent siblings)
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:36 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta, Samuel Holland, Andrew Jones

From: Samuel Holland <samuel.holland@sifive.com>

Some bits in the [ms]envcfg CSR, such as the CFI state and pointer
masking mode, need to be controlled on a per-thread basis. Support this
by keeping a copy of the CSR value in struct thread_struct and writing
it during context switches. It is safe to discard the old CSR value
during the context switch because the CSR is modified only by software,
so the CSR will remain in sync with the copy in thread_struct.

Use ALTERNATIVE directly instead of riscv_has_extension_unlikely() to
minimize branchiness in the context switching code.

Since thread_struct is copied during fork(), setting the value for the
init task sets the default value for all other threads.
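
The scheme can be modelled in plain C as below. This is only a sketch
under stated assumptions: model_csr_envcfg stands in for the hardware CSR,
and struct model_thread, model_switch_to_envcfg() and model_fork() are
hypothetical names, not kernel interfaces. The real patch uses an
ALTERNATIVE-patched csrw instead of a plain store.

```c
#include <string.h>

/* Stand-in for the envcfg CSR of the current hart. */
static unsigned long model_csr_envcfg;

/* Same bit position as ENVCFG_CBZE in asm/csr.h. */
#define MODEL_ENVCFG_CBZE (1UL << 7)

struct model_thread {
	unsigned long envcfg;	/* per-thread copy of the CSR value */
};

/*
 * Context switch: overwrite the CSR with the incoming thread's copy.
 * The old CSR value is not read back, since only software modifies it
 * and the outgoing thread's copy is already up to date.
 */
static void model_switch_to_envcfg(const struct model_thread *next)
{
	model_csr_envcfg = next->envcfg;
}

/* fork(): the child starts as a copy of the parent's thread state. */
static void model_fork(struct model_thread *child,
		       const struct model_thread *parent)
{
	memcpy(child, parent, sizeof(*child));
}
```

Setting MODEL_ENVCFG_CBZE once in the first thread is then enough for
every descendant to inherit it through model_fork().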

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
---
 arch/riscv/include/asm/processor.h | 1 +
 arch/riscv/include/asm/switch_to.h | 8 ++++++++
 arch/riscv/kernel/cpufeature.c     | 2 +-
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
index efa1b3519b23..c1a492508835 100644
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -102,6 +102,7 @@ struct thread_struct {
 	unsigned long s[12];	/* s[0]: frame pointer */
 	struct __riscv_d_ext_state fstate;
 	unsigned long bad_cause;
+	unsigned long envcfg;
 	u32 riscv_v_flags;
 	u32 vstate_ctrl;
 	struct __riscv_v_ext_state vstate;
diff --git a/arch/riscv/include/asm/switch_to.h b/arch/riscv/include/asm/switch_to.h
index 7594df37cc9f..9685cd85e57c 100644
--- a/arch/riscv/include/asm/switch_to.h
+++ b/arch/riscv/include/asm/switch_to.h
@@ -70,6 +70,13 @@ static __always_inline bool has_fpu(void) { return false; }
 #define __switch_to_fpu(__prev, __next) do { } while (0)
 #endif
 
+static inline void __switch_to_envcfg(struct task_struct *next)
+{
+	asm volatile (ALTERNATIVE("nop", "csrw " __stringify(CSR_ENVCFG) ", %0",
+				  0, RISCV_ISA_EXT_XLINUXENVCFG, 1)
+			:: "r" (next->thread.envcfg) : "memory");
+}
+
 extern struct task_struct *__switch_to(struct task_struct *,
 				       struct task_struct *);
 
@@ -103,6 +110,7 @@ do {							\
 		__switch_to_vector(__prev, __next);	\
 	if (switch_to_should_flush_icache(__next))	\
 		local_flush_icache_all();		\
+	__switch_to_envcfg(__next);			\
 	((last) = __switch_to(__prev, __next));		\
 } while (0)
 
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index e560a253e99b..27bafc5dd62d 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -923,7 +923,7 @@ unsigned long riscv_get_elf_hwcap(void)
 void riscv_user_isa_enable(void)
 {
 	if (riscv_has_extension_unlikely(RISCV_ISA_EXT_ZICBOZ))
-		csr_set(CSR_ENVCFG, ENVCFG_CBZE);
+		current->thread.envcfg |= ENVCFG_CBZE;
 	else if (any_cpu_has_zicboz)
 		pr_warn_once("Zicboz disabled as it is unavailable on some harts\n");
 }

-- 
2.45.0




* [PATCH v6 05/33] riscv: Call riscv_user_isa_enable() only on the boot hart
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (3 preceding siblings ...)
  2024-10-08 22:36 ` [PATCH v6 04/33] riscv: Add support for per-thread envcfg CSR values Deepak Gupta
@ 2024-10-08 22:36 ` Deepak Gupta
  2024-10-08 22:36 ` [PATCH v6 06/33] riscv/Kconfig: enable HAVE_EXIT_THREAD for riscv Deepak Gupta
                   ` (28 subsequent siblings)
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:36 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta, Samuel Holland, Andrew Jones, Conor Dooley

From: Samuel Holland <samuel.holland@sifive.com>

Now that the [ms]envcfg CSR value is maintained per thread, not per
hart, riscv_user_isa_enable() only needs to be called once during boot,
to set the value for the init task. This also allows it to be marked as
__init.

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
---
 arch/riscv/include/asm/cpufeature.h | 2 +-
 arch/riscv/kernel/cpufeature.c      | 4 ++--
 arch/riscv/kernel/smpboot.c         | 2 --
 3 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h
index 45f9c1171a48..ce9a995730c1 100644
--- a/arch/riscv/include/asm/cpufeature.h
+++ b/arch/riscv/include/asm/cpufeature.h
@@ -31,7 +31,7 @@ DECLARE_PER_CPU(struct riscv_cpuinfo, riscv_cpuinfo);
 /* Per-cpu ISA extensions. */
 extern struct riscv_isainfo hart_isa[NR_CPUS];
 
-void riscv_user_isa_enable(void);
+void __init riscv_user_isa_enable(void);
 
 #define _RISCV_ISA_EXT_DATA(_name, _id, _subset_exts, _subset_exts_size, _validate) {	\
 	.name = #_name,									\
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 27bafc5dd62d..b3a057c36996 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -920,12 +920,12 @@ unsigned long riscv_get_elf_hwcap(void)
 	return hwcap;
 }
 
-void riscv_user_isa_enable(void)
+void __init riscv_user_isa_enable(void)
 {
 	if (riscv_has_extension_unlikely(RISCV_ISA_EXT_ZICBOZ))
 		current->thread.envcfg |= ENVCFG_CBZE;
 	else if (any_cpu_has_zicboz)
-		pr_warn_once("Zicboz disabled as it is unavailable on some harts\n");
+		pr_warn("Zicboz disabled as it is unavailable on some harts\n");
 }
 
 #ifdef CONFIG_RISCV_ALTERNATIVE
diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index 0f8f1c95ac38..e36d20205bd7 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -233,8 +233,6 @@ asmlinkage __visible void smp_callin(void)
 	numa_add_cpu(curr_cpuid);
 	set_cpu_online(curr_cpuid, true);
 
-	riscv_user_isa_enable();
-
 	/*
 	 * Remote cache and TLB flushes are ignored while the CPU is offline,
 	 * so flush them both right now just in case.

-- 
2.45.0




* [PATCH v6 06/33] riscv/Kconfig: enable HAVE_EXIT_THREAD for riscv
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (4 preceding siblings ...)
  2024-10-08 22:36 ` [PATCH v6 05/33] riscv: Call riscv_user_isa_enable() only on the boot hart Deepak Gupta
@ 2024-10-08 22:36 ` Deepak Gupta
  2024-10-09 11:28   ` Mark Brown
  2024-10-08 22:36 ` [PATCH v6 07/33] dt-bindings: riscv: zicfilp and zicfiss in dt-bindings (extensions.yaml) Deepak Gupta
                   ` (27 subsequent siblings)
  33 siblings, 1 reply; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:36 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta

riscv will need an implementation of exit_thread() to clean up the shadow
stack when a thread exits. If the current thread has shadow stack enabled,
a shadow stack is allocated by default for any new thread.

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
---
 arch/riscv/Kconfig          | 1 +
 arch/riscv/kernel/process.c | 5 +++++
 2 files changed, 6 insertions(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 22dc5ea4196c..808ea66b9537 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -182,6 +182,7 @@ config RISCV
 	select HAVE_SAMPLE_FTRACE_DIRECT_MULTI
 	select HAVE_STACKPROTECTOR
 	select HAVE_SYSCALL_TRACEPOINTS
+	select HAVE_EXIT_THREAD
 	select HOTPLUG_CORE_SYNC_DEAD if HOTPLUG_CPU
 	select IRQ_DOMAIN
 	select IRQ_FORCED_THREADING
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index e3142d8a6e28..1f2574fb2edb 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -201,6 +201,11 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 	return 0;
 }
 
+void exit_thread(struct task_struct *tsk)
+{
+
+}
+
 int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
 {
 	unsigned long clone_flags = args->flags;

-- 
2.45.0




* [PATCH v6 07/33] dt-bindings: riscv: zicfilp and zicfiss in dt-bindings (extensions.yaml)
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (5 preceding siblings ...)
  2024-10-08 22:36 ` [PATCH v6 06/33] riscv/Kconfig: enable HAVE_EXIT_THREAD for riscv Deepak Gupta
@ 2024-10-08 22:36 ` Deepak Gupta
  2024-10-25 21:58   ` Rob Herring (Arm)
  2024-10-08 22:36 ` [PATCH v6 08/33] riscv: zicfiss / zicfilp enumeration Deepak Gupta
                   ` (26 subsequent siblings)
  33 siblings, 1 reply; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:36 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta

Make an entry for cfi extensions in extensions.yaml.

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
 Documentation/devicetree/bindings/riscv/extensions.yaml | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml
index 2cf2026cff57..356c60fd6cc8 100644
--- a/Documentation/devicetree/bindings/riscv/extensions.yaml
+++ b/Documentation/devicetree/bindings/riscv/extensions.yaml
@@ -368,6 +368,20 @@ properties:
             The standard Zicboz extension for cache-block zeroing as ratified
             in commit 3dd606f ("Create cmobase-v1.0.pdf") of riscv-CMOs.
 
+        - const: zicfilp
+          description: |
+            The standard Zicfilp extension for enforcing forward edge
+            control-flow integrity as ratified in commit 3f8e450 ("merge
+            pull request #227 from ved-rivos/0709") of riscv-cfi
+            github repo.
+
+        - const: zicfiss
+          description: |
+            The standard Zicfiss extension for enforcing backward edge
+            control-flow integrity as ratified in commit 3f8e450 ("merge
+            pull request #227 from ved-rivos/0709") of riscv-cfi
+            github repo.
+
         - const: zicntr
           description:
             The standard Zicntr extension for base counters and timers, as

-- 
2.45.0




* [PATCH v6 08/33] riscv: zicfiss / zicfilp enumeration
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (6 preceding siblings ...)
  2024-10-08 22:36 ` [PATCH v6 07/33] dt-bindings: riscv: zicfilp and zicfiss in dt-bindings (extensions.yaml) Deepak Gupta
@ 2024-10-08 22:36 ` Deepak Gupta
  2024-10-08 22:36 ` [PATCH v6 09/33] riscv: zicfiss / zicfilp extension csr and bit definitions Deepak Gupta
                   ` (25 subsequent siblings)
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:36 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta

This patch adds support for detecting zicfiss and zicfilp. zicfiss and
zicfilp stand for the unprivileged integer spec extensions for shadow stack
and branch tracking on indirect branches, respectively.

This patch looks for zicfiss and zicfilp in the device tree and accordingly
lights up the bits in the cpu feature bitmap. Furthermore, this patch adds
detection utility functions that return whether shadow stack or landing
pads are supported by the cpu.
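
As an illustration only, the bitmap lookup behind such helpers can be
sketched in userspace C as follows. The extension IDs 87 and 88 are the
ones added to hwcap.h by this patch; struct model_isainfo,
MODEL_ISA_EXT_MAX and the model_* helpers are hypothetical and do not
mirror the kernel's actual data structures.

```c
#include <limits.h>
#include <stdbool.h>

#define MODEL_ISA_EXT_ZICFILP	87	/* IDs from the hwcap.h hunk */
#define MODEL_ISA_EXT_ZICFISS	88
#define MODEL_ISA_EXT_MAX	128	/* assumed bitmap size */

#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)
#define MODEL_ISA_WORDS	((MODEL_ISA_EXT_MAX + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One feature bitmap per hart, one bit per extension ID. */
struct model_isainfo {
	unsigned long isa[MODEL_ISA_WORDS];
};

/* Mark an extension as present, e.g. after parsing the device tree. */
static void model_set_ext(struct model_isainfo *info, unsigned int ext)
{
	info->isa[ext / BITS_PER_WORD] |= 1UL << (ext % BITS_PER_WORD);
}

/* Test whether an extension bit is set. */
static bool model_has_ext(const struct model_isainfo *info, unsigned int ext)
{
	return (info->isa[ext / BITS_PER_WORD] >> (ext % BITS_PER_WORD)) & 1UL;
}

/*
 * Shadow stack is reported only if the feature is compiled in
 * (user_cfi_enabled models IS_ENABLED(CONFIG_RISCV_USER_CFI)) and the
 * hart advertises zicfiss.
 */
static bool model_supports_shadow_stack(const struct model_isainfo *info,
					bool user_cfi_enabled)
{
	return user_cfi_enabled && model_has_ext(info, MODEL_ISA_EXT_ZICFISS);
}
```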

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
 arch/riscv/include/asm/cpufeature.h | 13 +++++++++++++
 arch/riscv/include/asm/hwcap.h      |  2 ++
 arch/riscv/include/asm/processor.h  |  1 +
 arch/riscv/kernel/cpufeature.c      |  2 ++
 4 files changed, 18 insertions(+)

diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h
index ce9a995730c1..344b8e8cd3e8 100644
--- a/arch/riscv/include/asm/cpufeature.h
+++ b/arch/riscv/include/asm/cpufeature.h
@@ -8,6 +8,7 @@
 
 #include <linux/bitmap.h>
 #include <linux/jump_label.h>
+#include <linux/smp.h>
 #include <asm/hwcap.h>
 #include <asm/alternative-macros.h>
 #include <asm/errno.h>
@@ -180,4 +181,16 @@ static __always_inline bool riscv_cpu_has_extension_unlikely(int cpu, const unsi
 	return __riscv_isa_extension_available(hart_isa[cpu].isa, ext);
 }
 
+static inline bool cpu_supports_shadow_stack(void)
+{
+	return (IS_ENABLED(CONFIG_RISCV_USER_CFI) &&
+		    riscv_cpu_has_extension_unlikely(smp_processor_id(), RISCV_ISA_EXT_ZICFISS));
+}
+
+static inline bool cpu_supports_indirect_br_lp_instr(void)
+{
+	return (IS_ENABLED(CONFIG_RISCV_USER_CFI) &&
+		    riscv_cpu_has_extension_unlikely(smp_processor_id(), RISCV_ISA_EXT_ZICFILP));
+}
+
 #endif
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index 46d9de54179e..10d315a6ef0e 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -93,6 +93,8 @@
 #define RISCV_ISA_EXT_ZCMOP		84
 #define RISCV_ISA_EXT_ZAWRS		85
 #define RISCV_ISA_EXT_SVVPTC		86
+#define RISCV_ISA_EXT_ZICFILP		87
+#define RISCV_ISA_EXT_ZICFISS		88
 
 #define RISCV_ISA_EXT_XLINUXENVCFG	127
 
diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
index c1a492508835..aec3466a389c 100644
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -13,6 +13,7 @@
 #include <vdso/processor.h>
 
 #include <asm/ptrace.h>
+#include <asm/hwcap.h>
 
 #define arch_get_mmap_end(addr, len, flags)			\
 ({								\
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index b3a057c36996..70803aa66332 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -317,6 +317,8 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = {
 					  riscv_ext_zicbom_validate),
 	__RISCV_ISA_EXT_SUPERSET_VALIDATE(zicboz, RISCV_ISA_EXT_ZICBOZ, riscv_xlinuxenvcfg_exts,
 					  riscv_ext_zicboz_validate),
+	__RISCV_ISA_EXT_SUPERSET(zicfilp, RISCV_ISA_EXT_ZICFILP, riscv_xlinuxenvcfg_exts),
+	__RISCV_ISA_EXT_SUPERSET(zicfiss, RISCV_ISA_EXT_ZICFISS, riscv_xlinuxenvcfg_exts),
 	__RISCV_ISA_EXT_DATA(zicntr, RISCV_ISA_EXT_ZICNTR),
 	__RISCV_ISA_EXT_DATA(zicond, RISCV_ISA_EXT_ZICOND),
 	__RISCV_ISA_EXT_DATA(zicsr, RISCV_ISA_EXT_ZICSR),

-- 
2.45.0




* [PATCH v6 09/33] riscv: zicfiss / zicfilp extension csr and bit definitions
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (7 preceding siblings ...)
  2024-10-08 22:36 ` [PATCH v6 08/33] riscv: zicfiss / zicfilp enumeration Deepak Gupta
@ 2024-10-08 22:36 ` Deepak Gupta
  2024-10-08 22:36 ` [PATCH v6 10/33] riscv: usercfi state for task and save/restore of CSR_SSP on trap entry/exit Deepak Gupta
                   ` (24 subsequent siblings)
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:36 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta

The zicfiss and zicfilp extensions are enabled via bit 3 and bit 2 in the
*envcfg CSR, respectively. menvcfg controls enabling for S/HS mode, henvcfg
controls enabling for VS mode, and senvcfg controls enabling for U/VU mode.

The zicfilp extension extends the *status CSR to hold an `expected landing
pad` bit. A trap or interrupt can occur between an indirect jmp/call and
the target instr. The `expected landing pad` bit from the CPU is recorded
in the xstatus CSR so that when the supervisor performs xret, the `expected
landing pad` state of the CPU can be restored.

zicfiss adds one new CSR
- CSR_SSP: CSR_SSP contains current shadow stack pointer.
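
The bit positions described above can be checked with a small standalone
sketch. The MODEL_* macros mirror the values added to asm/csr.h in this
patch; the decoding helpers are hypothetical and exist only to show how
the bits would be tested.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_ENVCFG_LPE	(UINT64_C(1) << 2)	/* landing pad enable */
#define MODEL_ENVCFG_SSE	(UINT64_C(1) << 3)	/* shadow stack enable */
#define MODEL_SR_SPELP		(UINT64_C(1) << 23)	/* 0x00800000 */
#define MODEL_SR_MPELP		(UINT64_C(1) << 41)	/* 0x020000000000 */

/* Is landing pad enforcement (zicfilp) enabled in this envcfg value? */
static bool envcfg_landing_pad_enabled(uint64_t envcfg)
{
	return (envcfg & MODEL_ENVCFG_LPE) != 0;
}

/* Is shadow stack (zicfiss) enabled in this envcfg value? */
static bool envcfg_shadow_stack_enabled(uint64_t envcfg)
{
	return (envcfg & MODEL_ENVCFG_SSE) != 0;
}

/* Did the trapped context expect a landing pad (supervisor view)? */
static bool sstatus_expects_landing_pad(uint64_t sstatus)
{
	return (sstatus & MODEL_SR_SPELP) != 0;
}
```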

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
---
 arch/riscv/include/asm/csr.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
index 25966995da04..af7ed9bedaee 100644
--- a/arch/riscv/include/asm/csr.h
+++ b/arch/riscv/include/asm/csr.h
@@ -18,6 +18,15 @@
 #define SR_MPP		_AC(0x00001800, UL) /* Previously Machine */
 #define SR_SUM		_AC(0x00040000, UL) /* Supervisor User Memory Access */
 
+/* zicfilp landing pad status bit */
+#define SR_SPELP	_AC(0x00800000, UL)
+#define SR_MPELP	_AC(0x020000000000, UL)
+#ifdef CONFIG_RISCV_M_MODE
+#define SR_ELP		SR_MPELP
+#else
+#define SR_ELP		SR_SPELP
+#endif
+
 #define SR_FS		_AC(0x00006000, UL) /* Floating-point Status */
 #define SR_FS_OFF	_AC(0x00000000, UL)
 #define SR_FS_INITIAL	_AC(0x00002000, UL)
@@ -197,6 +206,8 @@
 #define ENVCFG_PBMTE			(_AC(1, ULL) << 62)
 #define ENVCFG_CBZE			(_AC(1, UL) << 7)
 #define ENVCFG_CBCFE			(_AC(1, UL) << 6)
+#define ENVCFG_LPE			(_AC(1, UL) << 2)
+#define ENVCFG_SSE			(_AC(1, UL) << 3)
 #define ENVCFG_CBIE_SHIFT		4
 #define ENVCFG_CBIE			(_AC(0x3, UL) << ENVCFG_CBIE_SHIFT)
 #define ENVCFG_CBIE_ILL			_AC(0x0, UL)
@@ -215,6 +226,11 @@
 #define SMSTATEEN0_HSENVCFG		(_ULL(1) << SMSTATEEN0_HSENVCFG_SHIFT)
 #define SMSTATEEN0_SSTATEEN0_SHIFT	63
 #define SMSTATEEN0_SSTATEEN0		(_ULL(1) << SMSTATEEN0_SSTATEEN0_SHIFT)
+/*
+ * zicfiss user mode csr
+ * CSR_SSP holds current shadow stack pointer.
+ */
+#define CSR_SSP                 0x011
 
 /* symbolic CSR names: */
 #define CSR_CYCLE		0xc00

-- 
2.45.0




* [PATCH v6 10/33] riscv: usercfi state for task and save/restore of CSR_SSP on trap entry/exit
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (8 preceding siblings ...)
  2024-10-08 22:36 ` [PATCH v6 09/33] riscv: zicfiss / zicfilp extension csr and bit definitions Deepak Gupta
@ 2024-10-08 22:36 ` Deepak Gupta
  2024-10-08 22:36 ` [PATCH v6 11/33] riscv/mm : ensure PROT_WRITE leads to VM_READ | VM_WRITE Deepak Gupta
                   ` (23 subsequent siblings)
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:36 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta

Carves out space in arch specific thread struct for cfi status and shadow
stack in usermode on riscv.

This patch does the following:
- defines a new structure cfi_status with status bit for cfi feature
- defines shadow stack pointer, base and size in cfi_status structure
- defines offsets to new member fields in thread in asm-offsets.c
- saves and restores shadow stack pointer on trap entry (U --> S) and exit
  (S --> U)

Shadow stack save/restore is gated on feature availability and implemented
using alternatives. The CSR could be context switched in `switch_to` as
well, but as soon as kernel shadow stack support gets rolled in, the shadow
stack pointer will need to be switched at the trap entry/exit point (much
like `sp`). It can be argued that a kernel shadow stack deployment may not
be as prevalent as user mode use of this feature. But even a minimal
deployment of kernel shadow stack means that it needs to be supported.
Hence the save/restore of the shadow stack pointer is done in entry.S
instead of in `switch_to.h`.
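
The intended entry/exit behaviour can be modelled in C roughly as below.
This is a sketch, not the assembly in entry.S: model_csr_ssp stands in for
CSR_SSP, and struct model_task, model_trap_entry() and
model_return_to_user() are hypothetical names. The SPP bit position (bit 8
of sstatus) is the standard one.

```c
/* sstatus.SPP: set when the trap was taken from S-mode. */
#define MODEL_SR_SPP (1UL << 8)

/* Stand-in for the CSR_SSP register of the current hart. */
static unsigned long model_csr_ssp;

struct model_task {
	unsigned long user_ssp;	/* saved user shadow stack pointer */
};

/*
 * Trap entry: if the previous mode was U, stash the user shadow stack
 * pointer in the task and zero the CSR for sanitization. Traps taken
 * from S-mode leave both untouched.
 */
static void model_trap_entry(struct model_task *t, unsigned long status)
{
	if (status & MODEL_SR_SPP)
		return;
	t->user_ssp = model_csr_ssp;
	model_csr_ssp = 0;
}

/* Return to U-mode: reload the CSR from the task's saved value. */
static void model_return_to_user(const struct model_task *t)
{
	model_csr_ssp = t->user_ssp;
}
```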

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
---
 arch/riscv/include/asm/processor.h   |  1 +
 arch/riscv/include/asm/thread_info.h |  3 +++
 arch/riscv/include/asm/usercfi.h     | 24 ++++++++++++++++++++++++
 arch/riscv/kernel/asm-offsets.c      |  4 ++++
 arch/riscv/kernel/entry.S            | 26 ++++++++++++++++++++++++++
 5 files changed, 58 insertions(+)

diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
index aec3466a389c..5a8031384021 100644
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -14,6 +14,7 @@
 
 #include <asm/ptrace.h>
 #include <asm/hwcap.h>
+#include <asm/usercfi.h>
 
 #define arch_get_mmap_end(addr, len, flags)			\
 ({								\
diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h
index ebe52f96da34..12263cef7518 100644
--- a/arch/riscv/include/asm/thread_info.h
+++ b/arch/riscv/include/asm/thread_info.h
@@ -57,6 +57,9 @@ struct thread_info {
 	long			user_sp;	/* User stack pointer */
 	int			cpu;
 	unsigned long		syscall_work;	/* SYSCALL_WORK_ flags */
+#ifdef CONFIG_RISCV_USER_CFI
+	struct cfi_status	user_cfi_state;
+#endif
 #ifdef CONFIG_SHADOW_CALL_STACK
 	void			*scs_base;
 	void			*scs_sp;
diff --git a/arch/riscv/include/asm/usercfi.h b/arch/riscv/include/asm/usercfi.h
new file mode 100644
index 000000000000..4fa201b4fc4e
--- /dev/null
+++ b/arch/riscv/include/asm/usercfi.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0
+ * Copyright (C) 2024 Rivos, Inc.
+ * Deepak Gupta <debug@rivosinc.com>
+ */
+#ifndef _ASM_RISCV_USERCFI_H
+#define _ASM_RISCV_USERCFI_H
+
+#ifndef __ASSEMBLY__
+#include <linux/types.h>
+
+#ifdef CONFIG_RISCV_USER_CFI
+struct cfi_status {
+	unsigned long ubcfi_en : 1; /* Enable for backward cfi. */
+	unsigned long rsvd : ((sizeof(unsigned long)*8) - 1);
+	unsigned long user_shdw_stk; /* Current user shadow stack pointer */
+	unsigned long shdw_stk_base; /* Base address of shadow stack */
+	unsigned long shdw_stk_size; /* size of shadow stack */
+};
+
+#endif /* CONFIG_RISCV_USER_CFI */
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_RISCV_USERCFI_H */
diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c
index e94180ba432f..766bd33f10cb 100644
--- a/arch/riscv/kernel/asm-offsets.c
+++ b/arch/riscv/kernel/asm-offsets.c
@@ -52,6 +52,10 @@ void asm_offsets(void)
 #endif
 
 	OFFSET(TASK_TI_CPU_NUM, task_struct, thread_info.cpu);
+#ifdef CONFIG_RISCV_USER_CFI
+	OFFSET(TASK_TI_CFI_STATUS, task_struct, thread_info.user_cfi_state);
+	OFFSET(TASK_TI_USER_SSP, task_struct, thread_info.user_cfi_state.user_shdw_stk);
+#endif
 	OFFSET(TASK_THREAD_F0,  task_struct, thread.fstate.f[0]);
 	OFFSET(TASK_THREAD_F1,  task_struct, thread.fstate.f[1]);
 	OFFSET(TASK_THREAD_F2,  task_struct, thread.fstate.f[2]);
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index c200d329d4bd..8f7f477517e3 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -147,6 +147,20 @@ SYM_CODE_START(handle_exception)
 
 	REG_L s0, TASK_TI_USER_SP(tp)
 	csrrc s1, CSR_STATUS, t0
+	/*
+	 * If previous mode was U, capture shadow stack pointer and save it away
+	 * Zero CSR_SSP at the same time for sanitization.
+	 */
+	ALTERNATIVE("nop; nop; nop; nop",
+				__stringify(			\
+				andi s2, s1, SR_SPP;	\
+				bnez s2, skip_ssp_save;	\
+				csrrw s2, CSR_SSP, x0;	\
+				REG_S s2, TASK_TI_USER_SSP(tp); \
+				skip_ssp_save:),
+				0,
+				RISCV_ISA_EXT_ZICFISS,
+				CONFIG_RISCV_USER_CFI)
 	csrr s2, CSR_EPC
 	csrr s3, CSR_TVAL
 	csrr s4, CSR_CAUSE
@@ -236,6 +250,18 @@ SYM_CODE_START_NOALIGN(ret_from_exception)
 	 * structures again.
 	 */
 	csrw CSR_SCRATCH, tp
+
+	/*
+	 * Going back to U mode, restore shadow stack pointer
+	 */
+	ALTERNATIVE("nop; nop",
+				__stringify(					\
+				REG_L s3, TASK_TI_USER_SSP(tp); \
+				csrw CSR_SSP, s3),
+				0,
+				RISCV_ISA_EXT_ZICFISS,
+				CONFIG_RISCV_USER_CFI)
+
 1:
 #ifdef CONFIG_RISCV_ISA_V_PREEMPTIVE
 	move a0, sp

-- 
2.45.0




* [PATCH v6 11/33] riscv/mm : ensure PROT_WRITE leads to VM_READ | VM_WRITE
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (9 preceding siblings ...)
  2024-10-08 22:36 ` [PATCH v6 10/33] riscv: usercfi state for task and save/restore of CSR_SSP on trap entry/exit Deepak Gupta
@ 2024-10-08 22:36 ` Deepak Gupta
  2024-10-09 13:36   ` Lorenzo Stoakes
  2024-10-08 22:36 ` [PATCH v6 12/33] riscv mm: manufacture shadow stack pte Deepak Gupta
                   ` (22 subsequent siblings)
  33 siblings, 1 reply; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:36 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta

`arch_calc_vm_prot_bits` is implemented on risc-v to return VM_READ |
VM_WRITE if PROT_WRITE is specified. Similarly `riscv_sys_mmap` is
updated to convert all incoming PROT_WRITE to (PROT_WRITE | PROT_READ).
This is to make sure that any existing apps using PROT_WRITE still work.

Earlier, `protection_map[VM_WRITE]` used to pick read-write PTE encodings.
Now `protection_map[VM_WRITE]` will always pick PAGE_SHADOWSTACK PTE
encodings for shadow stack. The above changes ensure that existing apps
continue to work because underneath, the kernel will be picking
`protection_map[VM_WRITE|VM_READ]` PTE encodings.
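
The combined effect on the prot to vm_flags translation can be sketched as
follows. This is a simplified model under stated assumptions: it folds the
generic translation and the riscv-specific addition into one hypothetical
function, model_calc_vm_prot_bits(), and the MODEL_* constants are local
stand-ins for the PROT_* and VM_* values.

```c
#define MODEL_PROT_READ		0x1UL
#define MODEL_PROT_WRITE	0x2UL
#define MODEL_PROT_EXEC		0x4UL

#define MODEL_VM_READ		0x1UL
#define MODEL_VM_WRITE		0x2UL
#define MODEL_VM_EXEC		0x4UL

/*
 * Translate mmap() protection bits into VMA flags. PROT_WRITE always
 * implies VM_READ as well, so a write-only VMA (which would select the
 * shadow stack encoding) can never result from a plain mmap() request.
 */
static unsigned long model_calc_vm_prot_bits(unsigned long prot)
{
	unsigned long vm = 0;

	if (prot & MODEL_PROT_READ)
		vm |= MODEL_VM_READ;
	if (prot & MODEL_PROT_WRITE)
		vm |= MODEL_VM_READ | MODEL_VM_WRITE;
	if (prot & MODEL_PROT_EXEC)
		vm |= MODEL_VM_EXEC;
	return vm;
}
```

With this mapping, a request for PROT_WRITE alone yields the same flags as
PROT_READ | PROT_WRITE, which is what keeps existing applications working.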

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
 arch/riscv/include/asm/mman.h    | 24 ++++++++++++++++++++++++
 arch/riscv/include/asm/pgtable.h |  1 +
 arch/riscv/kernel/sys_riscv.c    | 10 ++++++++++
 arch/riscv/mm/init.c             |  2 +-
 mm/mmap.c                        |  1 +
 5 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/include/asm/mman.h b/arch/riscv/include/asm/mman.h
new file mode 100644
index 000000000000..ef9fedf32546
--- /dev/null
+++ b/arch/riscv/include/asm/mman.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_MMAN_H__
+#define __ASM_MMAN_H__
+
+#include <linux/compiler.h>
+#include <linux/types.h>
+#include <uapi/asm/mman.h>
+
+static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
+	unsigned long pkey __always_unused)
+{
+	unsigned long ret = 0;
+
+	/*
+	 * If PROT_WRITE was specified, force it to VM_READ | VM_WRITE.
+	 * Only VM_WRITE means shadow stack.
+	 */
+	if (prot & PROT_WRITE)
+		ret = (VM_READ | VM_WRITE);
+	return ret;
+}
+#define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
+
+#endif /* ! __ASM_MMAN_H__ */
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index e79f15293492..4948a1f18ae8 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -177,6 +177,7 @@ extern struct pt_alloc_ops pt_ops __meminitdata;
 #define PAGE_READ_EXEC		__pgprot(_PAGE_BASE | _PAGE_READ | _PAGE_EXEC)
 #define PAGE_WRITE_EXEC		__pgprot(_PAGE_BASE | _PAGE_READ |	\
 					 _PAGE_EXEC | _PAGE_WRITE)
+#define PAGE_SHADOWSTACK       __pgprot(_PAGE_BASE | _PAGE_WRITE)
 
 #define PAGE_COPY		PAGE_READ
 #define PAGE_COPY_EXEC		PAGE_READ_EXEC
diff --git a/arch/riscv/kernel/sys_riscv.c b/arch/riscv/kernel/sys_riscv.c
index d77afe05578f..43a448bf254b 100644
--- a/arch/riscv/kernel/sys_riscv.c
+++ b/arch/riscv/kernel/sys_riscv.c
@@ -7,6 +7,7 @@
 
 #include <linux/syscalls.h>
 #include <asm/cacheflush.h>
+#include <asm-generic/mman-common.h>
 
 static long riscv_sys_mmap(unsigned long addr, unsigned long len,
 			   unsigned long prot, unsigned long flags,
@@ -16,6 +17,15 @@ static long riscv_sys_mmap(unsigned long addr, unsigned long len,
 	if (unlikely(offset & (~PAGE_MASK >> page_shift_offset)))
 		return -EINVAL;
 
+	/*
+	 * If PROT_WRITE is specified, extend it to include PROT_READ.
+	 * protection_map[VM_WRITE] now selects the shadow stack encoding, so
+	 * plain PROT_WRITE must select protection_map[VM_WRITE | VM_READ].
+	 * To create a shadow stack, user space should use the `map_shadow_stack` syscall.
+	 */
+	if (unlikely((prot & PROT_WRITE) && !(prot & PROT_READ)))
+		prot |= PROT_READ;
+
 	return ksys_mmap_pgoff(addr, len, prot, flags, fd,
 			       offset >> (PAGE_SHIFT - page_shift_offset));
 }
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 0e8c20adcd98..964810aeb405 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -326,7 +326,7 @@ pgd_t early_pg_dir[PTRS_PER_PGD] __initdata __aligned(PAGE_SIZE);
 static const pgprot_t protection_map[16] = {
 	[VM_NONE]					= PAGE_NONE,
 	[VM_READ]					= PAGE_READ,
-	[VM_WRITE]					= PAGE_COPY,
+	[VM_WRITE]					= PAGE_SHADOWSTACK,
 	[VM_WRITE | VM_READ]				= PAGE_COPY,
 	[VM_EXEC]					= PAGE_EXEC,
 	[VM_EXEC | VM_READ]				= PAGE_READ_EXEC,
diff --git a/mm/mmap.c b/mm/mmap.c
index dd4b35a25aeb..b56f1e8cbfc6 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -47,6 +47,7 @@
 #include <linux/oom.h>
 #include <linux/sched/mm.h>
 #include <linux/ksm.h>
+#include <linux/processor.h>
 
 #include <linux/uaccess.h>
 #include <asm/cacheflush.h>

-- 
2.45.0




* [PATCH v6 12/33] riscv mm: manufacture shadow stack pte
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (10 preceding siblings ...)
  2024-10-08 22:36 ` [PATCH v6 11/33] riscv/mm : ensure PROT_WRITE leads to VM_READ | VM_WRITE Deepak Gupta
@ 2024-10-08 22:36 ` Deepak Gupta
  2024-10-08 22:36 ` [PATCH v6 13/33] riscv mmu: teach pte_mkwrite to manufacture shadow stack PTEs Deepak Gupta
                   ` (21 subsequent siblings)
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:36 UTC (permalink / raw)

This patch implements creation of a shadow stack PTE on riscv. Creating a
shadow stack PTE on riscv means clearing RWX and then setting W=1.
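
As an illustration of "clear RWX, then set W", here is a minimal user-space
sketch. The bit positions follow the RISC-V privileged specification (R is
bit 1, W is bit 2, X is bit 3); the `MODEL_*` names and the function are
hypothetical and only mirror the kernel helper added by this patch.

```c
#include <stdint.h>

/* RISC-V PTE permission bits: R = bit 1, W = bit 2, X = bit 3. */
#define MODEL_PAGE_READ  (UINT64_C(1) << 1)
#define MODEL_PAGE_WRITE (UINT64_C(1) << 2)
#define MODEL_PAGE_EXEC  (UINT64_C(1) << 3)
#define MODEL_PAGE_LEAF  (MODEL_PAGE_READ | MODEL_PAGE_WRITE | MODEL_PAGE_EXEC)

/* Drop all of R/W/X, then set only W, leaving every other bit untouched. */
static uint64_t model_pte_mkwrite_shstk(uint64_t pte)
{
	return (pte & ~MODEL_PAGE_LEAF) | MODEL_PAGE_WRITE;
}
```

For example, a PTE value of 0xCB (V, R, X, A and D set) becomes 0xC5: the
V, A and D bits survive, and the permission field ends up as XWR = 010.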

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
---
 arch/riscv/include/asm/pgtable.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 4948a1f18ae8..2c6edc8d04a3 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -421,6 +421,11 @@ static inline pte_t pte_mkwrite_novma(pte_t pte)
 	return __pte(pte_val(pte) | _PAGE_WRITE);
 }
 
+static inline pte_t pte_mkwrite_shstk(pte_t pte)
+{
+	return __pte((pte_val(pte) & ~(_PAGE_LEAF)) | _PAGE_WRITE);
+}
+
 /* static inline pte_t pte_mkexec(pte_t pte) */
 
 static inline pte_t pte_mkdirty(pte_t pte)
@@ -738,6 +743,11 @@ static inline pmd_t pmd_mkwrite_novma(pmd_t pmd)
 	return pte_pmd(pte_mkwrite_novma(pmd_pte(pmd)));
 }
 
+static inline pmd_t pmd_mkwrite_shstk(pmd_t pmd)
+{
+	return __pmd((pmd_val(pmd) & ~(_PAGE_LEAF)) | _PAGE_WRITE);
+}
+
 static inline pmd_t pmd_wrprotect(pmd_t pmd)
 {
 	return pte_pmd(pte_wrprotect(pmd_pte(pmd)));

-- 
2.45.0




* [PATCH v6 13/33] riscv mmu: teach pte_mkwrite to manufacture shadow stack PTEs
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (11 preceding siblings ...)
  2024-10-08 22:36 ` [PATCH v6 12/33] riscv mm: manufacture shadow stack pte Deepak Gupta
@ 2024-10-08 22:36 ` Deepak Gupta
  2024-10-08 22:36 ` [PATCH v6 14/33] riscv mmu: write protect and shadow stack Deepak Gupta
                   ` (20 subsequent siblings)
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:36 UTC (permalink / raw)

pte_mkwrite creates PTEs with WRITE encodings for the underlying arch.
The underlying arch can have two types of writeable mappings: one that can
be written using regular store instructions, and another that can only be
written using specialized store instructions (like shadow stack stores).
pte_mkwrite can select the write PTE encoding based on the VMA flags (i.e.
VM_SHADOW_STACK).
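
The dispatch can be sketched in user-space C as follows. This is only a
model: `struct model_vma`, the `MODEL_*` constants and the bit chosen for
`MODEL_VM_SHADOW_STACK` are hypothetical, while the R/W/X bit positions are
those of the RISC-V PTE format.

```c
#include <stdint.h>

#define MODEL_PAGE_READ  (UINT64_C(1) << 1)
#define MODEL_PAGE_WRITE (UINT64_C(1) << 2)
#define MODEL_PAGE_EXEC  (UINT64_C(1) << 3)
#define MODEL_PAGE_LEAF  (MODEL_PAGE_READ | MODEL_PAGE_WRITE | MODEL_PAGE_EXEC)

/* Hypothetical flag bit; the real VM_SHADOW_STACK value is arch-defined. */
#define MODEL_VM_SHADOW_STACK (1UL << 0)

struct model_vma {
	unsigned long vm_flags;
};

/* Regular writeable mapping: add W on top of the existing permissions. */
static uint64_t model_pte_mkwrite_novma(uint64_t pte)
{
	return pte | MODEL_PAGE_WRITE;
}

/* Shadow stack mapping: the permission field becomes W only. */
static uint64_t model_pte_mkwrite_shstk(uint64_t pte)
{
	return (pte & ~MODEL_PAGE_LEAF) | MODEL_PAGE_WRITE;
}

/* Pick the encoding from the VMA flags, as the patch does. */
static uint64_t model_pte_mkwrite(uint64_t pte, const struct model_vma *vma)
{
	if (vma->vm_flags & MODEL_VM_SHADOW_STACK)
		return model_pte_mkwrite_shstk(pte);

	return model_pte_mkwrite_novma(pte);
}
```

Starting from a readable PTE (V and R set), the shadow stack VMA yields
XWR = 010 while an ordinary VMA yields XWR = 011.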

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
---
 arch/riscv/include/asm/pgtable.h |  7 +++++++
 arch/riscv/mm/pgtable.c          | 17 +++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 2c6edc8d04a3..7963ab11d924 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -416,6 +416,10 @@ static inline pte_t pte_wrprotect(pte_t pte)
 
 /* static inline pte_t pte_mkread(pte_t pte) */
 
+struct vm_area_struct;
+pte_t pte_mkwrite(pte_t pte, struct vm_area_struct *vma);
+#define pte_mkwrite pte_mkwrite
+
 static inline pte_t pte_mkwrite_novma(pte_t pte)
 {
 	return __pte(pte_val(pte) | _PAGE_WRITE);
@@ -738,6 +742,9 @@ static inline pmd_t pmd_mkyoung(pmd_t pmd)
 	return pte_pmd(pte_mkyoung(pmd_pte(pmd)));
 }
 
+pmd_t pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma);
+#define pmd_mkwrite pmd_mkwrite
+
 static inline pmd_t pmd_mkwrite_novma(pmd_t pmd)
 {
 	return pte_pmd(pte_mkwrite_novma(pmd_pte(pmd)));
diff --git a/arch/riscv/mm/pgtable.c b/arch/riscv/mm/pgtable.c
index 4ae67324f992..be5d38546bb3 100644
--- a/arch/riscv/mm/pgtable.c
+++ b/arch/riscv/mm/pgtable.c
@@ -155,3 +155,20 @@ pmd_t pmdp_collapse_flush(struct vm_area_struct *vma,
 	return pmd;
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+
+pte_t pte_mkwrite(pte_t pte, struct vm_area_struct *vma)
+{
+	if (vma->vm_flags & VM_SHADOW_STACK)
+		return pte_mkwrite_shstk(pte);
+
+	return pte_mkwrite_novma(pte);
+}
+
+pmd_t pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma)
+{
+	if (vma->vm_flags & VM_SHADOW_STACK)
+		return pmd_mkwrite_shstk(pmd);
+
+	return pmd_mkwrite_novma(pmd);
+}
+

-- 
2.45.0




* [PATCH v6 14/33] riscv mmu: write protect and shadow stack
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (12 preceding siblings ...)
  2024-10-08 22:36 ` [PATCH v6 13/33] riscv mmu: teach pte_mkwrite to manufacture shadow stack PTEs Deepak Gupta
@ 2024-10-08 22:36 ` Deepak Gupta
  2024-10-08 22:36 ` [PATCH v6 15/33] riscv/mm: Implement map_shadow_stack() syscall Deepak Gupta
                   ` (19 subsequent siblings)
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:36 UTC (permalink / raw)

`fork` implements copy on write (COW) by making pages read-only in both
child and parent.

ptep_set_wrprotect and pte_wrprotect clear _PAGE_WRITE in the PTE. The
assumption is that the page is readable and that COW happens on a fault.

To implement COW on shadow stack pages, clearing the W bit alone makes them
XWR = 000. This results in a wrong PTE setting which says no perms but has
V=1 and the PFN field pointing to the final page. The desired behavior is
instead to turn it into a read-only page, take an access (load/store) fault
on sspush/sspop (shadow stack) and then perform COW on such pages. This way
regular reads are still allowed and do not lead to COW, matching the
current COW behavior for writeable non-shadow-stack memory.

This doesn't interfere with existing COW for read-write memory either. The
assumption there is always that _PAGE_READ is already set, and thus
setting _PAGE_READ again is harmless.
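
A small user-space sketch of the two write-protect variants makes the
difference concrete. The `MODEL_*` names and both functions are
hypothetical; the bit positions (V is bit 0, R is bit 1, W is bit 2) are
those of the RISC-V PTE format.

```c
#include <stdint.h>

#define MODEL_PAGE_PRESENT (UINT64_C(1) << 0)	/* V */
#define MODEL_PAGE_READ    (UINT64_C(1) << 1)	/* R */
#define MODEL_PAGE_WRITE   (UINT64_C(1) << 2)	/* W */

/* Behavior before this patch: only W is cleared. */
static uint64_t model_pte_wrprotect_old(uint64_t pte)
{
	return pte & ~MODEL_PAGE_WRITE;
}

/* Behavior after this patch: W is cleared and R is forced on. */
static uint64_t model_pte_wrprotect(uint64_t pte)
{
	return (pte & ~MODEL_PAGE_WRITE) | MODEL_PAGE_READ;
}
```

With a shadow stack PTE (V=1, XWR = 010, value 0x5) the old variant yields
0x1, i.e. V=1 with XWR = 000, whereas the new variant yields 0x3, a
read-only leaf. For an ordinary read-write PTE (0x7) both variants yield
0x3, so existing COW handling is unaffected.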

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Alexandre Ghiti <alexghiti@rivosinc.com>
---
 arch/riscv/include/asm/pgtable.h | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 7963ab11d924..fdab7d74437d 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -411,7 +411,7 @@ static inline int pte_devmap(pte_t pte)
 
 static inline pte_t pte_wrprotect(pte_t pte)
 {
-	return __pte(pte_val(pte) & ~(_PAGE_WRITE));
+	return __pte((pte_val(pte) & ~(_PAGE_WRITE)) | (_PAGE_READ));
 }
 
 /* static inline pte_t pte_mkread(pte_t pte) */
@@ -612,7 +612,15 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 static inline void ptep_set_wrprotect(struct mm_struct *mm,
 				      unsigned long address, pte_t *ptep)
 {
-	atomic_long_and(~(unsigned long)_PAGE_WRITE, (atomic_long_t *)ptep);
+	pte_t read_pte = READ_ONCE(*ptep);
+	/*
+	 * ptep_set_wrprotect can be called for shadow stack ranges too.
+	 * Shadow stack memory is XWR = 010, so only clearing _PAGE_WRITE would lead to
+	 * encoding 000b, which is a wrong encoding with V = 1. This should lead to a page
+	 * fault, but we don't want this wrong configuration to be set in page tables.
+	 */
+	atomic_long_set((atomic_long_t *)ptep,
+			((pte_val(read_pte) & ~(unsigned long)_PAGE_WRITE) | _PAGE_READ));
 }
 
 #define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH

-- 
2.45.0




* [PATCH v6 15/33] riscv/mm: Implement map_shadow_stack() syscall
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (13 preceding siblings ...)
  2024-10-08 22:36 ` [PATCH v6 14/33] riscv mmu: write protect and shadow stack Deepak Gupta
@ 2024-10-08 22:36 ` Deepak Gupta
  2024-10-08 22:36 ` [PATCH v6 16/33] riscv/shstk: If needed allocate a new shadow stack on clone Deepak Gupta
                   ` (18 subsequent siblings)
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:36 UTC (permalink / raw)

As discussed extensively in the changelog for the addition of this
syscall on x86 ("x86/shstk: Introduce map_shadow_stack syscall") the
existing mmap() and madvise() syscalls do not map entirely well onto the
security requirements for shadow stack memory since they lead to windows
where memory is allocated but not yet protected or stacks which are not
properly and safely initialised. Instead a new syscall map_shadow_stack()
has been defined which allocates and initialises a shadow stack page.

This patch implements this syscall for riscv. riscv doesn't require the
token to be set up by the kernel because user mode can do that by itself.
However, to provide compatibility and portability with other architectures,
user mode can specify the token set flag.
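
The argument checks and the token placement performed by the syscall in
this patch can be modelled in user space roughly as follows. This is a
sketch under assumptions: 4 KiB pages, a `base` argument standing in for
the address the kernel would pick, and hypothetical `MODEL_*` names. It
performs no mapping and no shadow stack write.

```c
#include <errno.h>

#define MODEL_PAGE_SIZE              4096UL	/* assumed page size */
#define MODEL_SHSTK_ENTRY_SIZE       sizeof(void *)
#define MODEL_SHADOW_STACK_SET_TOKEN 1U

/*
 * Returns 0 and fills *token_addr and *token_val, or a negative errno.
 * Both outputs are 0 when no token was requested.
 */
static long model_map_shadow_stack(unsigned long base, unsigned long size,
				   unsigned int flags,
				   unsigned long *token_addr,
				   unsigned long *token_val)
{
	unsigned long aligned_size, ssp;

	/* Only the set-token flag is accepted. */
	if (flags & ~MODEL_SHADOW_STACK_SET_TOKEN)
		return -EINVAL;
	/* A token needs at least one shadow stack entry. */
	if ((flags & MODEL_SHADOW_STACK_SET_TOKEN) && size < MODEL_SHSTK_ENTRY_SIZE)
		return -ENOSPC;
	if (base & (MODEL_PAGE_SIZE - 1))
		return -EINVAL;
	/* Page-align the size and catch wrap-around. */
	aligned_size = (size + MODEL_PAGE_SIZE - 1) & ~(MODEL_PAGE_SIZE - 1);
	if (aligned_size < size)
		return -EOVERFLOW;

	*token_addr = 0;
	*token_val = 0;
	if (!(flags & MODEL_SHADOW_STACK_SET_TOKEN))
		return 0;

	/* The token sits one entry below base + size and holds base + size. */
	ssp = base + size;
	if (ssp % MODEL_SHSTK_ENTRY_SIZE)
		return -EINVAL;
	*token_addr = ssp - MODEL_SHSTK_ENTRY_SIZE;
	*token_val = ssp;
	return 0;
}
```

Note that the token offset is derived from the requested `size`, not from
the page-aligned size, which matches how the patch passes `size` as the
token offset.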

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
 arch/riscv/kernel/Makefile      |   2 +
 arch/riscv/kernel/usercfi.c     | 145 ++++++++++++++++++++++++++++++++++++++++
 include/uapi/asm-generic/mman.h |   4 ++
 3 files changed, 151 insertions(+)

diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index 7f88cc4931f5..eb2c94dd0a9d 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -117,3 +117,5 @@ obj-$(CONFIG_COMPAT)		+= compat_vdso/
 obj-$(CONFIG_64BIT)		+= pi/
 obj-$(CONFIG_ACPI)		+= acpi.o
 obj-$(CONFIG_ACPI_NUMA)	+= acpi_numa.o
+
+obj-$(CONFIG_RISCV_USER_CFI) += usercfi.o
diff --git a/arch/riscv/kernel/usercfi.c b/arch/riscv/kernel/usercfi.c
new file mode 100644
index 000000000000..96bb324abafb
--- /dev/null
+++ b/arch/riscv/kernel/usercfi.c
@@ -0,0 +1,145 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2024 Rivos, Inc.
+ * Deepak Gupta <debug@rivosinc.com>
+ */
+
+#include <linux/sched.h>
+#include <linux/bitops.h>
+#include <linux/types.h>
+#include <linux/mm.h>
+#include <linux/mman.h>
+#include <linux/uaccess.h>
+#include <linux/sizes.h>
+#include <linux/user.h>
+#include <linux/syscalls.h>
+#include <linux/prctl.h>
+#include <asm/csr.h>
+#include <asm/usercfi.h>
+
+#define SHSTK_ENTRY_SIZE sizeof(void *)
+
+/*
+ * Writes on shadow stack can either be `sspush` or `ssamoswap`. `sspush` can happen
+ * implicitly on current shadow stack pointed to by CSR_SSP. `ssamoswap` takes pointer to
+ * shadow stack. To keep it simple, we plan to use `ssamoswap` to perform writes on shadow
+ * stack.
+ */
+static noinline unsigned long amo_user_shstk(unsigned long *addr, unsigned long val)
+{
+	/*
+	 * Never expect -1 on shadow stack. Expect return addresses and zero
+	 */
+	unsigned long swap = -1;
+
+	__enable_user_access();
+	asm goto(
+		".option push\n"
+		".option arch, +zicfiss\n"
+		"1: ssamoswap.d %[swap], %[val], %[addr]\n"
+		_ASM_EXTABLE(1b, %l[fault])
+		RISCV_ACQUIRE_BARRIER
+		".option pop\n"
+		: [swap] "=r" (swap), [addr] "+A" (*addr)
+		: [val] "r" (val)
+		: "memory"
+		: fault
+		);
+	__disable_user_access();
+	return swap;
+fault:
+	__disable_user_access();
+	return -1;
+}
+
+/*
+ * Create a restore token on the shadow stack.  A token is always XLEN wide
+ * and aligned to XLEN.
+ */
+static int create_rstor_token(unsigned long ssp, unsigned long *token_addr)
+{
+	unsigned long addr;
+
+	/* Token must be aligned */
+	if (!IS_ALIGNED(ssp, SHSTK_ENTRY_SIZE))
+		return -EINVAL;
+
+	/* On RISC-V we're constructing token to be function of address itself */
+	addr = ssp - SHSTK_ENTRY_SIZE;
+
+	if (amo_user_shstk((unsigned long __user *)addr, (unsigned long) ssp) == -1)
+		return -EFAULT;
+
+	if (token_addr)
+		*token_addr = addr;
+
+	return 0;
+}
+
+static unsigned long allocate_shadow_stack(unsigned long addr, unsigned long size,
+				unsigned long token_offset,
+				bool set_tok)
+{
+	int flags = MAP_ANONYMOUS | MAP_PRIVATE;
+	struct mm_struct *mm = current->mm;
+	unsigned long populate, tok_loc = 0;
+
+	if (addr)
+		flags |= MAP_FIXED_NOREPLACE;
+
+	mmap_write_lock(mm);
+	addr = do_mmap(NULL, addr, size, PROT_READ, flags,
+				VM_SHADOW_STACK | VM_WRITE, 0, &populate, NULL);
+	mmap_write_unlock(mm);
+
+	if (!set_tok || IS_ERR_VALUE(addr))
+		goto out;
+
+	if (create_rstor_token(addr + token_offset, &tok_loc)) {
+		vm_munmap(addr, size);
+		return -EINVAL;
+	}
+
+	addr = tok_loc;
+
+out:
+	return addr;
+}
+
+SYSCALL_DEFINE3(map_shadow_stack, unsigned long, addr, unsigned long, size, unsigned int, flags)
+{
+	bool set_tok = flags & SHADOW_STACK_SET_TOKEN;
+	unsigned long aligned_size = 0;
+
+	if (!cpu_supports_shadow_stack())
+		return -EOPNOTSUPP;
+
+	/* Anything other than set token should result in invalid param */
+	if (flags & ~SHADOW_STACK_SET_TOKEN)
+		return -EINVAL;
+
+	/*
+	 * Unlike other architectures, on RISC-V the shadow stack pointer is held in CSR_SSP,
+	 * a CSR available in all modes. CSR accesses are performed using a 12-bit index
+	 * encoded in the instruction itself. This makes register programming static, so writes
+	 * to a CSR can't be unintentional from the programmer's perspective. As long as the
+	 * programmer has properly guarded the areas which write to CSR_SSP, shadow stack pivoting
+	 * is not possible. Since CSR_SSP is writeable by user mode, user mode can itself set up a
+	 * shadow stack token after allocation. However, to provide portability with other
+	 * architectures (because `map_shadow_stack` is an arch-agnostic syscall), RISC-V honors
+	 * the token flag: if it is provided in flags, a token is set up at the base.
+	 */
+
+	/* If there isn't space for a token */
+	if (set_tok && size < SHSTK_ENTRY_SIZE)
+		return -ENOSPC;
+
+	if (addr && (addr & (PAGE_SIZE - 1)))
+		return -EINVAL;
+
+	aligned_size = PAGE_ALIGN(size);
+	if (aligned_size < size)
+		return -EOVERFLOW;
+
+	return allocate_shadow_stack(addr, aligned_size, size, set_tok);
+}
diff --git a/include/uapi/asm-generic/mman.h b/include/uapi/asm-generic/mman.h
index 57e8195d0b53..9cfb3c1e337d 100644
--- a/include/uapi/asm-generic/mman.h
+++ b/include/uapi/asm-generic/mman.h
@@ -19,4 +19,8 @@
 #define MCL_FUTURE	2		/* lock all future mappings */
 #define MCL_ONFAULT	4		/* lock all pages that are faulted in */
 
+/* Set up a restore token in the shadow stack */
+#define SHADOW_STACK_SET_TOKEN (1ULL << 0)
+/* Set up a top of stack marker in the shadow stack */
+#define SHADOW_STACK_SET_MARKER (1ULL << 1)
 #endif /* __ASM_GENERIC_MMAN_H */

-- 
2.45.0




* [PATCH v6 16/33] riscv/shstk: If needed allocate a new shadow stack on clone
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (14 preceding siblings ...)
  2024-10-08 22:36 ` [PATCH v6 15/33] riscv/mm: Implement map_shadow_stack() syscall Deepak Gupta
@ 2024-10-08 22:36 ` Deepak Gupta
  2024-10-08 22:55   ` Edgecombe, Rick P
  2024-10-08 22:36 ` [PATCH v6 17/33] prctl: arch-agnostic prctl for shadow stack Deepak Gupta
                   ` (17 subsequent siblings)
  33 siblings, 1 reply; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:36 UTC (permalink / raw)

Userspace specifies CLONE_VM to share the address space and spawn a new
thread. `clone` allows userspace to specify a new stack for the new thread.
However, there is no way to specify a new shadow stack base address without
changing the API. This patch allocates a new shadow stack whenever CLONE_VM
is given.

In case of CLONE_VFORK, the parent is suspended until the child finishes,
and thus the child can use the parent's shadow stack. In case of !CLONE_VM,
COW kicks in because the entire address space is copied from parent to
child.
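
The decision described above can be summarized as a small user-space
sketch. The enum and function names are hypothetical; the two flag values
match CLONE_VM and CLONE_VFORK from the Linux UAPI headers, and the order
of the checks follows `shstk_alloc_thread_stack()` in this patch.

```c
#include <stdbool.h>

/* Same values as CLONE_VM and CLONE_VFORK in the Linux UAPI headers. */
#define MODEL_CLONE_VM    0x00000100UL
#define MODEL_CLONE_VFORK 0x00004000UL

enum model_shstk_action {
	MODEL_SHSTK_NONE,		/* shadow stack off: nothing to do */
	MODEL_SHSTK_SHARE_PARENT,	/* CLONE_VFORK: reuse the parent's */
	MODEL_SHSTK_COW_COPY,		/* !CLONE_VM: copied address space, COW */
	MODEL_SHSTK_ALLOC_NEW,		/* CLONE_VM: allocate a separate one */
};

static enum model_shstk_action
model_clone_shstk_action(bool shstk_enabled, unsigned long clone_flags)
{
	if (!shstk_enabled)
		return MODEL_SHSTK_NONE;
	/* vfork is checked first, so CLONE_VM | CLONE_VFORK shares. */
	if (clone_flags & MODEL_CLONE_VFORK)
		return MODEL_SHSTK_SHARE_PARENT;
	if (!(clone_flags & MODEL_CLONE_VM))
		return MODEL_SHSTK_COW_COPY;
	return MODEL_SHSTK_ALLOC_NEW;
}
```

Only the last case allocates memory; in the other cases the kernel helper
returns 0 and the child keeps the shadow stack pointer copied from the
parent.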

`clone3` is extensible and can provide a mechanism for passing a shadow
stack as an input parameter. This is not settled yet and is being
extensively discussed on the mailing list. Once that's settled, this commit
will adapt to it.

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
 arch/riscv/include/asm/usercfi.h |  25 ++++++++
 arch/riscv/kernel/process.c      |  11 +++-
 arch/riscv/kernel/usercfi.c      | 121 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 156 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/include/asm/usercfi.h b/arch/riscv/include/asm/usercfi.h
index 4fa201b4fc4e..4da9cbc8f9b5 100644
--- a/arch/riscv/include/asm/usercfi.h
+++ b/arch/riscv/include/asm/usercfi.h
@@ -8,6 +8,9 @@
 #ifndef __ASSEMBLY__
 #include <linux/types.h>
 
+struct task_struct;
+struct kernel_clone_args;
+
 #ifdef CONFIG_RISCV_USER_CFI
 struct cfi_status {
 	unsigned long ubcfi_en : 1; /* Enable for backward cfi. */
@@ -17,6 +20,28 @@ struct cfi_status {
 	unsigned long shdw_stk_size; /* size of shadow stack */
 };
 
+unsigned long shstk_alloc_thread_stack(struct task_struct *tsk,
+							const struct kernel_clone_args *args);
+void shstk_release(struct task_struct *tsk);
+void set_shstk_base(struct task_struct *task, unsigned long shstk_addr, unsigned long size);
+unsigned long get_shstk_base(struct task_struct *task, unsigned long *size);
+void set_active_shstk(struct task_struct *task, unsigned long shstk_addr);
+bool is_shstk_enabled(struct task_struct *task);
+
+#else
+
+#define shstk_alloc_thread_stack(tsk, args) 0
+
+#define shstk_release(tsk)
+
+#define get_shstk_base(task, size) 0UL
+
+#define set_shstk_base(task, shstk_addr, size)
+
+#define set_active_shstk(task, shstk_addr)
+
+#define is_shstk_enabled(task) false
+
 #endif /* CONFIG_RISCV_USER_CFI */
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index 1f2574fb2edb..f6f58b1ed905 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -28,6 +28,7 @@
 #include <asm/vector.h>
 #include <asm/cpufeature.h>
 #include <asm/exec.h>
+#include <asm/usercfi.h>
 
 #if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_STACKPROTECTOR_PER_TASK)
 #include <linux/stackprotector.h>
@@ -203,7 +204,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 
 void exit_thread(struct task_struct *tsk)
 {
-
+	shstk_release(tsk);
 }
 
 int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
@@ -211,6 +212,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
 	unsigned long clone_flags = args->flags;
 	unsigned long usp = args->stack;
 	unsigned long tls = args->tls;
+	unsigned long ssp = 0;
 	struct pt_regs *childregs = task_pt_regs(p);
 
 	memset(&p->thread.s, 0, sizeof(p->thread.s));
@@ -225,11 +227,18 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
 		p->thread.s[0] = (unsigned long)args->fn;
 		p->thread.s[1] = (unsigned long)args->fn_arg;
 	} else {
+		/* Allocate a new shadow stack if needed; with CLONE_VM we have to. */
+		ssp = shstk_alloc_thread_stack(p, args);
+		if (IS_ERR_VALUE(ssp))
+			return PTR_ERR((void *)ssp);
+
 		*childregs = *(current_pt_regs());
 		/* Turn off status.VS */
 		riscv_v_vstate_off(childregs);
 		if (usp) /* User fork */
 			childregs->sp = usp;
+		/* if needed, set new ssp */
+		ssp ? set_active_shstk(p, ssp) : 0;
 		if (clone_flags & CLONE_SETTLS)
 			childregs->tp = tls;
 		childregs->a0 = 0; /* Return value of fork() */
diff --git a/arch/riscv/kernel/usercfi.c b/arch/riscv/kernel/usercfi.c
index 96bb324abafb..6cd166b73316 100644
--- a/arch/riscv/kernel/usercfi.c
+++ b/arch/riscv/kernel/usercfi.c
@@ -19,6 +19,41 @@
 
 #define SHSTK_ENTRY_SIZE sizeof(void *)
 
+bool is_shstk_enabled(struct task_struct *task)
+{
+	return task->thread_info.user_cfi_state.ubcfi_en ? true : false;
+}
+
+void set_shstk_base(struct task_struct *task, unsigned long shstk_addr, unsigned long size)
+{
+	task->thread_info.user_cfi_state.shdw_stk_base = shstk_addr;
+	task->thread_info.user_cfi_state.shdw_stk_size = size;
+}
+
+unsigned long get_shstk_base(struct task_struct *task, unsigned long *size)
+{
+	if (size)
+		*size = task->thread_info.user_cfi_state.shdw_stk_size;
+	return task->thread_info.user_cfi_state.shdw_stk_base;
+}
+
+void set_active_shstk(struct task_struct *task, unsigned long shstk_addr)
+{
+	task->thread_info.user_cfi_state.user_shdw_stk = shstk_addr;
+}
+
+/*
+ * If size is 0, then to be compatible with regular stack we want it to be as big as
+ * regular stack. Else PAGE_ALIGN it and return back
+ */
+static unsigned long calc_shstk_size(unsigned long size)
+{
+	if (size)
+		return PAGE_ALIGN(size);
+
+	return PAGE_ALIGN(min_t(unsigned long long, rlimit(RLIMIT_STACK), SZ_4G));
+}
+
 /*
  * Writes on shadow stack can either be `sspush` or `ssamoswap`. `sspush` can happen
  * implicitly on current shadow stack pointed to by CSR_SSP. `ssamoswap` takes pointer to
@@ -143,3 +178,89 @@ SYSCALL_DEFINE3(map_shadow_stack, unsigned long, addr, unsigned long, size, unsi
 
 	return allocate_shadow_stack(addr, aligned_size, size, set_tok);
 }
+
+/*
+ * This gets called during clone/clone3/fork. It is needed to allocate a shadow stack for
+ * cases where CLONE_VM is specified and thus a different stack is specified by the user. We
+ * thus need a separate shadow stack too. How a separate shadow stack is specified by the
+ * user is still being debated. Once that's settled, remove this part of the comment.
+ * This function simply returns 0 if shadow stacks are not supported or if a separate shadow
+ * stack allocation is not needed (like in case of !CLONE_VM).
+ */
+unsigned long shstk_alloc_thread_stack(struct task_struct *tsk,
+					   const struct kernel_clone_args *args)
+{
+	unsigned long addr, size;
+
+	/* If shadow stack is not supported, return 0 */
+	if (!cpu_supports_shadow_stack())
+		return 0;
+
+	/*
+	 * If shadow stack is not enabled on the new thread, skip any
+	 * switch to a new shadow stack.
+	 */
+	if (!is_shstk_enabled(tsk))
+		return 0;
+
+	/*
+	 * For CLONE_VFORK the child will share the parent's shadow stack.
+	 * Set base = 0 and size = 0; this is a special value to track this state
+	 * so the freeing logic run for the child knows to leave it alone.
+	 */
+	if (args->flags & CLONE_VFORK) {
+		set_shstk_base(tsk, 0, 0);
+		return 0;
+	}
+
+	/*
+	 * For !CLONE_VM the child will use a copy of the parent's shadow
+	 * stack.
+	 */
+	if (!(args->flags & CLONE_VM))
+		return 0;
+
+	/*
+	 * Reaching here means CLONE_VM was specified and thus a separate shadow
+	 * stack is needed for the newly cloned thread. Note: the allocation below
+	 * happens using the current mm.
+	 */
+	size = calc_shstk_size(args->stack_size);
+	addr = allocate_shadow_stack(0, size, 0, false);
+	if (IS_ERR_VALUE(addr))
+		return addr;
+
+	set_shstk_base(tsk, addr, size);
+
+	return addr + size;
+}
+
+void shstk_release(struct task_struct *tsk)
+{
+	unsigned long base = 0, size = 0;
+	/* If shadow stack is not supported or not enabled, nothing to release */
+	if (!cpu_supports_shadow_stack() ||
+		!is_shstk_enabled(tsk))
+		return;
+
+	/*
+	 * When fork() with CLONE_VM fails, the child (tsk) already has a
+	 * shadow stack allocated, and exit_thread() calls this function to
+	 * free it.  In this case the parent (current) and the child share
+	 * the same mm struct. Move forward only when they're same.
+	 */
+	if (!tsk->mm || tsk->mm != current->mm)
+		return;
+
+	/*
+	 * We know shadow stack is enabled but if base is NULL, then
+	 * this task is not managing its own shadow stack (CLONE_VFORK). So
+	 * skip freeing it.
+	 */
+	base = get_shstk_base(tsk, &size);
+	if (!base)
+		return;
+
+	vm_munmap(base, size);
+	set_shstk_base(tsk, 0, 0);
+}

-- 
2.45.0




* [PATCH v6 17/33] prctl: arch-agnostic prctl for shadow stack
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (15 preceding siblings ...)
  2024-10-08 22:36 ` [PATCH v6 16/33] riscv/shstk: If needed allocate a new shadow stack on clone Deepak Gupta
@ 2024-10-08 22:36 ` Deepak Gupta
  2024-10-08 22:37 ` [PATCH v6 18/33] prctl: arch-agnostic prctl for indirect branch tracking Deepak Gupta
                   ` (16 subsequent siblings)
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:36 UTC (permalink / raw)

From: Mark Brown <broonie@kernel.org>

Three architectures (x86, aarch64, riscv) have announced support for
shadow stacks with fairly similar functionality.  While x86 is using
arch_prctl() to control the functionality, neither arm64 nor riscv uses
that interface. This patch therefore adds arch-agnostic prctl() support to
get and set the status of shadow stacks and to lock the current
configuration to prevent further changes, with support for turning
individual subfeatures on and off so applications can limit their exposure
to features that they do not need.  The features are:

  - PR_SHADOW_STACK_ENABLE: Tracking and enforcement of shadow stacks,
    including allocation of a shadow stack if one is not already
    allocated.
  - PR_SHADOW_STACK_WRITE: Writes to specific addresses in the shadow
    stack.
  - PR_SHADOW_STACK_PUSH: Push additional values onto the shadow stack.
  - PR_SHADOW_STACK_DISABLE: Allows the shadow stack to be disabled.
    Note that once locked, disable must fail.

These features are expected to be inherited by new threads and cleared
on exec().  Unknown features should be rejected for enable but accepted
for locking (in order to allow for future proofing).
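
As an illustration of the userspace side of this interface, the sketch
below only reads the status word and tests feature bits. The option and
bit values are the ones this patch adds to include/uapi/linux/prctl.h;
the helper names read_shadow_stack_status() and
shadow_stack_feature_on() are hypothetical and not part of the series.
Enabling is deliberately left out, since it is expected to be done by
libc early in process startup rather than from an arbitrary function.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/prctl.h>

/* Values taken from this patch; older headers may not define them yet. */
#ifndef PR_GET_SHADOW_STACK_STATUS
#define PR_GET_SHADOW_STACK_STATUS 74
#endif
#ifndef PR_SHADOW_STACK_ENABLE
#define PR_SHADOW_STACK_ENABLE (1UL << 0)
#endif
#ifndef PR_SHADOW_STACK_WRITE
#define PR_SHADOW_STACK_WRITE (1UL << 1)
#endif
#ifndef PR_SHADOW_STACK_PUSH
#define PR_SHADOW_STACK_PUSH (1UL << 2)
#endif

/*
 * Hypothetical helper: read the calling thread's shadow stack status.
 * Returns 0 on success or a negative errno, e.g. -EINVAL on a kernel or
 * architecture that only has the weak default implementation.
 * arg3..arg5 must be zero, matching the check in kernel/sys.c.
 */
static int read_shadow_stack_status(unsigned long *status)
{
	if (prctl(PR_GET_SHADOW_STACK_STATUS, (unsigned long)status,
		  0UL, 0UL, 0UL) != 0)
		return -errno;
	return 0;
}

/* Hypothetical helper: true if any bit of @feature is set in @status. */
static bool shadow_stack_feature_on(unsigned long status, unsigned long feature)
{
	return (status & feature) != 0;
}
```

A caller would typically treat a negative return as "shadow stacks not
available" and otherwise test PR_SHADOW_STACK_ENABLE in the returned
word.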

This is based on a patch originally written by Deepak Gupta but later
modified by Mark Brown for arm's GCS patch series.

Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
 include/linux/mm.h         |  3 +++
 include/uapi/linux/prctl.h | 21 +++++++++++++++++++++
 kernel/sys.c               | 30 ++++++++++++++++++++++++++++++
 3 files changed, 54 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 57533b9cae95..54e2b3f1cc30 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4146,6 +4146,9 @@ static inline bool pfn_is_unaccepted_memory(unsigned long pfn)
 {
 	return range_contains_unaccepted_memory(pfn << PAGE_SHIFT, PAGE_SIZE);
 }
+int arch_get_shadow_stack_status(struct task_struct *t, unsigned long __user *status);
+int arch_set_shadow_stack_status(struct task_struct *t, unsigned long status);
+int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
 
 void vma_pgtable_walk_begin(struct vm_area_struct *vma);
 void vma_pgtable_walk_end(struct vm_area_struct *vma);
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 35791791a879..b8d7b6361754 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -327,5 +327,26 @@ struct prctl_mm_map {
 # define PR_PPC_DEXCR_CTRL_SET_ONEXEC	 0x8 /* Set the aspect on exec */
 # define PR_PPC_DEXCR_CTRL_CLEAR_ONEXEC	0x10 /* Clear the aspect on exec */
 # define PR_PPC_DEXCR_CTRL_MASK		0x1f
+/*
+ * Get the current shadow stack configuration for the current thread,
+ * this will be the value configured via PR_SET_SHADOW_STACK_STATUS.
+ */
+#define PR_GET_SHADOW_STACK_STATUS      74
+
+/*
+ * Set the current shadow stack configuration.  Enabling the shadow
+ * stack will cause a shadow stack to be allocated for the thread.
+ */
+#define PR_SET_SHADOW_STACK_STATUS      75
+# define PR_SHADOW_STACK_ENABLE         (1UL << 0)
+# define PR_SHADOW_STACK_WRITE		(1UL << 1)
+# define PR_SHADOW_STACK_PUSH		(1UL << 2)
+
+/*
+ * Prevent further changes to the specified shadow stack
+ * configuration.  All bits may be locked via this call, including
+ * undefined bits.
+ */
+#define PR_LOCK_SHADOW_STACK_STATUS      76
 
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/sys.c b/kernel/sys.c
index 4da31f28fda8..3d38a9c7c5c9 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2324,6 +2324,21 @@ int __weak arch_prctl_spec_ctrl_set(struct task_struct *t, unsigned long which,
 	return -EINVAL;
 }
 
+int __weak arch_get_shadow_stack_status(struct task_struct *t, unsigned long __user *status)
+{
+	return -EINVAL;
+}
+
+int __weak arch_set_shadow_stack_status(struct task_struct *t, unsigned long status)
+{
+	return -EINVAL;
+}
+
+int __weak arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status)
+{
+	return -EINVAL;
+}
+
 #define PR_IO_FLUSHER (PF_MEMALLOC_NOIO | PF_LOCAL_THROTTLE)
 
 #ifdef CONFIG_ANON_VMA_NAME
@@ -2784,6 +2799,21 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 	case PR_RISCV_SET_ICACHE_FLUSH_CTX:
 		error = RISCV_SET_ICACHE_FLUSH_CTX(arg2, arg3);
 		break;
+	case PR_GET_SHADOW_STACK_STATUS:
+		if (arg3 || arg4 || arg5)
+			return -EINVAL;
+		error = arch_get_shadow_stack_status(me, (unsigned long __user *) arg2);
+		break;
+	case PR_SET_SHADOW_STACK_STATUS:
+		if (arg3 || arg4 || arg5)
+			return -EINVAL;
+		error = arch_set_shadow_stack_status(me, arg2);
+		break;
+	case PR_LOCK_SHADOW_STACK_STATUS:
+		if (arg3 || arg4 || arg5)
+			return -EINVAL;
+		error = arch_lock_shadow_stack_status(me, arg2);
+		break;
 	default:
 		error = -EINVAL;
 		break;

-- 
2.45.0




* [PATCH v6 18/33] prctl: arch-agnostic prctl for indirect branch tracking
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (16 preceding siblings ...)
  2024-10-08 22:36 ` [PATCH v6 17/33] prctl: arch-agnostic prctl for shadow stack Deepak Gupta
@ 2024-10-08 22:37 ` Deepak Gupta
  2024-10-09 11:03   ` Mark Brown
  2024-10-08 22:37 ` [PATCH v6 19/33] riscv: Implements arch agnostic shadow stack prctls Deepak Gupta
                   ` (15 subsequent siblings)
  33 siblings, 1 reply; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:37 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta

Three architectures (x86, aarch64, riscv) support the indirect branch
tracking feature in a very similar fashion. At a very high level, indirect
branch tracking is a CPU feature where the CPU tracks branches which use a
memory operand to perform control transfer in a program. As part of this
tracking of indirect branches, the CPU enters a state where it expects a
landing pad instr at the target, and if one is not found the CPU raises a
fault (architecture dependent).

x86 landing pad instr - `ENDBRANCH`
aarch64 landing pad instr - `BTI`
riscv landing pad instr - `lpad`
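
The tracking behaviour described above can be illustrated with a toy
model. This is only a sketch of the general idea, not of any
architecture's exact rules: the instruction kinds, struct insn and
run_with_tracking() are hypothetical names, and details such as label
matching or alignment checks are ignored.

```c
#include <stdbool.h>
#include <stddef.h>

enum insn_kind { INSN_OTHER, INSN_LANDING_PAD, INSN_INDIRECT_BRANCH };

struct insn {
	enum insn_kind kind;
	size_t target;		/* only used by INSN_INDIRECT_BRANCH */
};

/*
 * Run a toy program from index 0. Returns the index of the instruction
 * that would fault, or -1 if no fault is seen. The run ends when pc
 * leaves the program or when a step bound (guarding against loops) is
 * reached.
 */
static long run_with_tracking(const struct insn *prog, size_t len,
			      bool tracking_enabled)
{
	bool expect_landing_pad = false;
	size_t pc = 0;
	size_t steps = 0;

	while (pc < len && steps++ < 4 * len) {
		const struct insn *cur = &prog[pc];

		/* After an indirect branch, only a landing pad is legal. */
		if (expect_landing_pad && cur->kind != INSN_LANDING_PAD)
			return (long)pc;
		expect_landing_pad = false;

		if (cur->kind == INSN_INDIRECT_BRANCH) {
			pc = cur->target;
			expect_landing_pad = tracking_enabled;
		} else {
			pc++;
		}
	}
	return -1;
}
```

With tracking enabled, a branch whose target is INSN_LANDING_PAD runs
cleanly, while a branch to INSN_OTHER is reported as faulting at the
target index. With tracking disabled, both run cleanly.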

Given that three major arches have support for indirect branch tracking,
this patch makes `prctl` for indirect branch tracking arch agnostic.

To allow userspace to enable this feature for itself, the following prctls
are defined:
 - PR_GET_INDIR_BR_LP_STATUS: Gets current configured status for indirect
   branch tracking.
 - PR_SET_INDIR_BR_LP_STATUS: Sets a configuration for indirect branch
   tracking.
   The following status options are allowed:
       - PR_INDIR_BR_LP_ENABLE: Enables indirect branch tracking on user
         thread.
       - PR_INDIR_BR_LP_DISABLE: Disables indirect branch tracking on user
         thread.
 - PR_LOCK_INDIR_BR_LP_STATUS: Locks configured status for indirect branch
   tracking for user thread.

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
 include/linux/cpu.h        |  4 ++++
 include/uapi/linux/prctl.h | 27 +++++++++++++++++++++++++++
 kernel/sys.c               | 30 ++++++++++++++++++++++++++++++
 3 files changed, 61 insertions(+)

diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index bdcec1732445..eff56aae05d7 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -203,4 +203,8 @@ static inline bool cpu_mitigations_auto_nosmt(void)
 }
 #endif
 
+int arch_get_indir_br_lp_status(struct task_struct *t, unsigned long __user *status);
+int arch_set_indir_br_lp_status(struct task_struct *t, unsigned long status);
+int arch_lock_indir_br_lp_status(struct task_struct *t, unsigned long status);
+
 #endif /* _LINUX_CPU_H_ */
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index b8d7b6361754..41ffb53490a4 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -349,4 +349,31 @@ struct prctl_mm_map {
  */
 #define PR_LOCK_SHADOW_STACK_STATUS      76
 
+/*
+ * Get the current indirect branch tracking configuration for the current
+ * thread, this will be the value configured via PR_SET_INDIR_BR_LP_STATUS.
+ */
+#define PR_GET_INDIR_BR_LP_STATUS      77
+
+/*
+ * Set the indirect branch tracking configuration. PR_INDIR_BR_LP_ENABLE will
+ * enable cpu feature for user thread, to track all indirect branches and ensure
+ * they land on arch defined landing pad instruction.
+ * x86 - If enabled, an indirect branch must land on `ENDBRANCH` instruction.
+ * aarch64 - If enabled, an indirect branch must land on `BTI` instruction.
+ * riscv - If enabled, an indirect branch must land on `lpad` instruction.
+ * PR_INDIR_BR_LP_DISABLE will disable feature for user thread and indirect
+ * branches will no longer be tracked by cpu to land on arch defined landing
+ * pad instruction.
+ */
+#define PR_SET_INDIR_BR_LP_STATUS      78
+# define PR_INDIR_BR_LP_ENABLE		   (1UL << 0)
+
+/*
+ * Prevent further changes to the specified indirect branch tracking
+ * configuration.  All bits may be locked via this call, including
+ * undefined bits.
+ */
+#define PR_LOCK_INDIR_BR_LP_STATUS      79
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/sys.c b/kernel/sys.c
index 3d38a9c7c5c9..dafa31485584 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2339,6 +2339,21 @@ int __weak arch_lock_shadow_stack_status(struct task_struct *t, unsigned long st
 	return -EINVAL;
 }
 
+int __weak arch_get_indir_br_lp_status(struct task_struct *t, unsigned long __user *status)
+{
+	return -EINVAL;
+}
+
+int __weak arch_set_indir_br_lp_status(struct task_struct *t, unsigned long status)
+{
+	return -EINVAL;
+}
+
+int __weak arch_lock_indir_br_lp_status(struct task_struct *t, unsigned long status)
+{
+	return -EINVAL;
+}
+
 #define PR_IO_FLUSHER (PF_MEMALLOC_NOIO | PF_LOCAL_THROTTLE)
 
 #ifdef CONFIG_ANON_VMA_NAME
@@ -2814,6 +2829,21 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 			return -EINVAL;
 		error = arch_lock_shadow_stack_status(me, arg2);
 		break;
+	case PR_GET_INDIR_BR_LP_STATUS:
+		if (arg3 || arg4 || arg5)
+			return -EINVAL;
+		error = arch_get_indir_br_lp_status(me, (unsigned long __user *) arg2);
+		break;
+	case PR_SET_INDIR_BR_LP_STATUS:
+		if (arg3 || arg4 || arg5)
+			return -EINVAL;
+		error = arch_set_indir_br_lp_status(me, arg2);
+		break;
+	case PR_LOCK_INDIR_BR_LP_STATUS:
+		if (arg3 || arg4 || arg5)
+			return -EINVAL;
+		error = arch_lock_indir_br_lp_status(me, arg2);
+		break;
 	default:
 		error = -EINVAL;
 		break;

-- 
2.45.0




* [PATCH v6 19/33] riscv: Implements arch agnostic shadow stack prctls
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (17 preceding siblings ...)
  2024-10-08 22:37 ` [PATCH v6 18/33] prctl: arch-agnostic prctl for indirect branch tracking Deepak Gupta
@ 2024-10-08 22:37 ` Deepak Gupta
  2024-10-09 12:44   ` Mark Brown
  2024-10-08 22:37 ` [PATCH v6 20/33] riscv: Implements arch agnostic indirect branch tracking prctls Deepak Gupta
                   ` (14 subsequent siblings)
  33 siblings, 1 reply; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:37 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta

Implement architecture agnostic prctls() interface for setting and getting
shadow stack status.

prctls implemented are PR_GET_SHADOW_STACK_STATUS,
PR_SET_SHADOW_STACK_STATUS and PR_LOCK_SHADOW_STACK_STATUS.

As part of PR_SET_SHADOW_STACK_STATUS/PR_GET_SHADOW_STACK_STATUS, only
PR_SHADOW_STACK_ENABLE is implemented because RISCV allows each mode to
write to its own shadow stack using `sspush` or `ssamoswap`.

PR_LOCK_SHADOW_STACK_STATUS locks current configuration of shadow stack
enabling.
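
The set/lock rules implemented in this patch can be summarised with a
small userspace model. It is a sketch only: struct shstk_model,
model_set_status() and model_lock_status() are hypothetical names, and
the model leaves out shadow stack allocation, release, and the CPU
support check done by the real arch_*_shadow_stack_status() functions.

```c
#include <errno.h>
#include <stdbool.h>

#define MODEL_SHADOW_STACK_ENABLE (1UL << 0) /* mirrors PR_SHADOW_STACK_ENABLE */

struct shstk_model {
	bool enabled;
	bool locked;
};

/* Unknown bits are rejected, and so is any change once locked. */
static int model_set_status(struct shstk_model *m, unsigned long status)
{
	if (status & ~MODEL_SHADOW_STACK_ENABLE)
		return -EINVAL;
	if (m->locked)
		return -EINVAL;
	m->enabled = (status & MODEL_SHADOW_STACK_ENABLE) != 0;
	return 0;
}

/* Locking only succeeds while the shadow stack is enabled. */
static int model_lock_status(struct shstk_model *m)
{
	if (!m->enabled)
		return -EINVAL;
	m->locked = true;
	return 0;
}
```

In this model a thread that enables and then locks can no longer
disable, which matches the intent that a locked configuration cannot be
changed.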

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
 arch/riscv/include/asm/usercfi.h |  18 ++++++-
 arch/riscv/kernel/process.c      |   8 +++
 arch/riscv/kernel/usercfi.c      | 107 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 132 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/include/asm/usercfi.h b/arch/riscv/include/asm/usercfi.h
index 4da9cbc8f9b5..0b3aff008c85 100644
--- a/arch/riscv/include/asm/usercfi.h
+++ b/arch/riscv/include/asm/usercfi.h
@@ -7,6 +7,7 @@
 
 #ifndef __ASSEMBLY__
 #include <linux/types.h>
+#include <linux/prctl.h>
 
 struct task_struct;
 struct kernel_clone_args;
@@ -14,7 +15,8 @@ struct kernel_clone_args;
 #ifdef CONFIG_RISCV_USER_CFI
 struct cfi_status {
 	unsigned long ubcfi_en : 1; /* Enable for backward cfi. */
-	unsigned long rsvd : ((sizeof(unsigned long)*8) - 1);
+	unsigned long ubcfi_locked : 1;
+	unsigned long rsvd : ((sizeof(unsigned long)*8) - 2);
 	unsigned long user_shdw_stk; /* Current user shadow stack pointer */
 	unsigned long shdw_stk_base; /* Base address of shadow stack */
 	unsigned long shdw_stk_size; /* size of shadow stack */
@@ -27,6 +29,12 @@ void set_shstk_base(struct task_struct *task, unsigned long shstk_addr, unsigned
 unsigned long get_shstk_base(struct task_struct *task, unsigned long *size);
 void set_active_shstk(struct task_struct *task, unsigned long shstk_addr);
 bool is_shstk_enabled(struct task_struct *task);
+bool is_shstk_locked(struct task_struct *task);
+bool is_shstk_allocated(struct task_struct *task);
+void set_shstk_lock(struct task_struct *task);
+void set_shstk_status(struct task_struct *task, bool enable);
+
+#define PR_SHADOW_STACK_SUPPORTED_STATUS_MASK (PR_SHADOW_STACK_ENABLE)
 
 #else
 
@@ -42,6 +50,14 @@ bool is_shstk_enabled(struct task_struct *task);
 
 #define is_shstk_enabled(task) false
 
+#define is_shstk_locked(task) false
+
+#define is_shstk_allocated(task) false
+
+#define set_shstk_lock(task)
+
+#define set_shstk_status(task, enable)
+
 #endif /* CONFIG_RISCV_USER_CFI */
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index f6f58b1ed905..f7dec532657f 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -152,6 +152,14 @@ void start_thread(struct pt_regs *regs, unsigned long pc,
 	regs->epc = pc;
 	regs->sp = sp;
 
+	/*
+	 * clear shadow stack state on exec.
+	 * libc will set it later via prctl.
+	 */
+	set_shstk_status(current, false);
+	set_shstk_base(current, 0, 0);
+	set_active_shstk(current, 0);
+
 #ifdef CONFIG_64BIT
 	regs->status &= ~SR_UXL;
 
diff --git a/arch/riscv/kernel/usercfi.c b/arch/riscv/kernel/usercfi.c
index 6cd166b73316..6ac5e87b4c70 100644
--- a/arch/riscv/kernel/usercfi.c
+++ b/arch/riscv/kernel/usercfi.c
@@ -24,6 +24,16 @@ bool is_shstk_enabled(struct task_struct *task)
 	return task->thread_info.user_cfi_state.ubcfi_en ? true : false;
 }
 
+bool is_shstk_allocated(struct task_struct *task)
+{
+	return task->thread_info.user_cfi_state.shdw_stk_base ? true : false;
+}
+
+bool is_shstk_locked(struct task_struct *task)
+{
+	return task->thread_info.user_cfi_state.ubcfi_locked ? true : false;
+}
+
 void set_shstk_base(struct task_struct *task, unsigned long shstk_addr, unsigned long size)
 {
 	task->thread_info.user_cfi_state.shdw_stk_base = shstk_addr;
@@ -42,6 +52,23 @@ void set_active_shstk(struct task_struct *task, unsigned long shstk_addr)
 	task->thread_info.user_cfi_state.user_shdw_stk = shstk_addr;
 }
 
+void set_shstk_status(struct task_struct *task, bool enable)
+{
+	task->thread_info.user_cfi_state.ubcfi_en = enable ? 1 : 0;
+
+	if (enable)
+		task->thread.envcfg |= ENVCFG_SSE;
+	else
+		task->thread.envcfg &= ~ENVCFG_SSE;
+
+	csr_write(CSR_ENVCFG, task->thread.envcfg);
+}
+
+void set_shstk_lock(struct task_struct *task)
+{
+	task->thread_info.user_cfi_state.ubcfi_locked = 1;
+}
+
 /*
  * If size is 0, then to be compatible with regular stack we want it to be as big as
  * regular stack. Else PAGE_ALIGN it and return back
@@ -264,3 +291,83 @@ void shstk_release(struct task_struct *tsk)
 	vm_munmap(base, size);
 	set_shstk_base(tsk, 0, 0);
 }
+
+int arch_get_shadow_stack_status(struct task_struct *t, unsigned long __user *status)
+{
+	unsigned long bcfi_status = 0;
+
+	if (!cpu_supports_shadow_stack())
+		return -EINVAL;
+
+	/* this means shadow stack is enabled on the task */
+	bcfi_status |= (is_shstk_enabled(t) ? PR_SHADOW_STACK_ENABLE : 0);
+
+	return copy_to_user(status, &bcfi_status, sizeof(bcfi_status)) ? -EFAULT : 0;
+}
+
+int arch_set_shadow_stack_status(struct task_struct *t, unsigned long status)
+{
+	unsigned long size = 0, addr = 0;
+	bool enable_shstk = false;
+
+	if (!cpu_supports_shadow_stack())
+		return -EINVAL;
+
+	/* Reject unknown flags */
+	if (status & ~PR_SHADOW_STACK_SUPPORTED_STATUS_MASK)
+		return -EINVAL;
+
+	/* bcfi status is locked and further can't be modified by user */
+	if (is_shstk_locked(t))
+		return -EINVAL;
+
+	enable_shstk = status & PR_SHADOW_STACK_ENABLE;
+	/* Request is to enable shadow stack and shadow stack is not enabled already */
+	if (enable_shstk && !is_shstk_enabled(t)) {
+		/* shadow stack was allocated and enable request again
+		 * no need to support such usecase and return EINVAL.
+		 */
+		if (is_shstk_allocated(t))
+			return -EINVAL;
+
+		size = calc_shstk_size(0);
+		addr = allocate_shadow_stack(0, size, 0, false);
+		if (IS_ERR_VALUE(addr))
+			return -ENOMEM;
+		set_shstk_base(t, addr, size);
+		set_active_shstk(t, addr + size);
+	}
+
+	/*
+	 * If a request to disable shadow stack happens, let's go ahead and release it
+	 * Although, if CLONE_VFORKed child did this, then in that case we will end up
+	 * not releasing the shadow stack (because it might be needed in parent). Although
+	 * we will disable it for VFORKed child. And if VFORKed child tries to enable again
+	 * then in that case, it'll get entirely new shadow stack because following
+	 * conditions are true
+	 *  - shadow stack was not enabled for vforked child
+	 *  - shadow stack base was anyways pointing to 0
+	 * This shouldn't be a big issue because we want parent to have availability of shadow
+	 * stack whenever VFORKed child releases resources via exit or exec but at the same
+	 * time we want VFORKed child to break away and establish new shadow stack if it desires
+	 *
+	 */
+	if (!enable_shstk)
+		shstk_release(t);
+
+	set_shstk_status(t, enable_shstk);
+	return 0;
+}
+
+int arch_lock_shadow_stack_status(struct task_struct *task,
+				unsigned long arg)
+{
+	/* If shstk not supported or not enabled on task, nothing to lock here */
+	if (!cpu_supports_shadow_stack() ||
+		!is_shstk_enabled(task))
+		return -EINVAL;
+
+	set_shstk_lock(task);
+
+	return 0;
+}

-- 
2.45.0




* [PATCH v6 20/33] riscv: Implements arch agnostic indirect branch tracking prctls
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (18 preceding siblings ...)
  2024-10-08 22:37 ` [PATCH v6 19/33] riscv: Implements arch agnostic shadow stack prctls Deepak Gupta
@ 2024-10-08 22:37 ` Deepak Gupta
  2024-10-08 22:37 ` [PATCH v6 21/33] riscv/traps: Introduce software check exception Deepak Gupta
                   ` (13 subsequent siblings)
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:37 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta

prctls implemented are:
PR_SET_INDIR_BR_LP_STATUS, PR_GET_INDIR_BR_LP_STATUS and
PR_LOCK_INDIR_BR_LP_STATUS.

On trap entry, ELP state is recorded in sstatus image on stack and SR_ELP
in CSR_STATUS is cleared.

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
 arch/riscv/include/asm/usercfi.h | 16 ++++++++-
 arch/riscv/kernel/entry.S        |  2 +-
 arch/riscv/kernel/process.c      |  5 +++
 arch/riscv/kernel/usercfi.c      | 76 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 97 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/include/asm/usercfi.h b/arch/riscv/include/asm/usercfi.h
index 0b3aff008c85..19ee8e7e23ee 100644
--- a/arch/riscv/include/asm/usercfi.h
+++ b/arch/riscv/include/asm/usercfi.h
@@ -16,7 +16,9 @@ struct kernel_clone_args;
 struct cfi_status {
 	unsigned long ubcfi_en : 1; /* Enable for backward cfi. */
 	unsigned long ubcfi_locked : 1;
-	unsigned long rsvd : ((sizeof(unsigned long)*8) - 2);
+	unsigned long ufcfi_en : 1; /* Enable for forward cfi. Note that ELP goes in sstatus */
+	unsigned long ufcfi_locked : 1;
+	unsigned long rsvd : ((sizeof(unsigned long)*8) - 4);
 	unsigned long user_shdw_stk; /* Current user shadow stack pointer */
 	unsigned long shdw_stk_base; /* Base address of shadow stack */
 	unsigned long shdw_stk_size; /* size of shadow stack */
@@ -33,6 +35,10 @@ bool is_shstk_locked(struct task_struct *task);
 bool is_shstk_allocated(struct task_struct *task);
 void set_shstk_lock(struct task_struct *task);
 void set_shstk_status(struct task_struct *task, bool enable);
+bool is_indir_lp_enabled(struct task_struct *task);
+bool is_indir_lp_locked(struct task_struct *task);
+void set_indir_lp_status(struct task_struct *task, bool enable);
+void set_indir_lp_lock(struct task_struct *task);
 
 #define PR_SHADOW_STACK_SUPPORTED_STATUS_MASK (PR_SHADOW_STACK_ENABLE)
 
@@ -58,6 +64,14 @@ void set_shstk_status(struct task_struct *task, bool enable);
 
 #define set_shstk_status(task, enable)
 
+#define is_indir_lp_enabled(task) false
+
+#define is_indir_lp_locked(task) false
+
+#define set_indir_lp_status(task, enable)
+
+#define set_indir_lp_lock(task)
+
 #endif /* CONFIG_RISCV_USER_CFI */
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index 8f7f477517e3..a1f258fd7bbc 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -143,7 +143,7 @@ SYM_CODE_START(handle_exception)
 	 * Disable the FPU/Vector to detect illegal usage of floating point
 	 * or vector in kernel space.
 	 */
-	li t0, SR_SUM | SR_FS_VS
+	li t0, SR_SUM | SR_FS_VS | SR_ELP
 
 	REG_L s0, TASK_TI_USER_SP(tp)
 	csrrc s1, CSR_STATUS, t0
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index f7dec532657f..5207f018415c 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -159,6 +159,11 @@ void start_thread(struct pt_regs *regs, unsigned long pc,
 	set_shstk_status(current, false);
 	set_shstk_base(current, 0, 0);
 	set_active_shstk(current, 0);
+	/*
+	 * disable indirect branch tracking on exec.
+	 * libc will enable it later via prctl.
+	 */
+	set_indir_lp_status(current, false);
 
 #ifdef CONFIG_64BIT
 	regs->status &= ~SR_UXL;
diff --git a/arch/riscv/kernel/usercfi.c b/arch/riscv/kernel/usercfi.c
index 6ac5e87b4c70..21ea2237efcf 100644
--- a/arch/riscv/kernel/usercfi.c
+++ b/arch/riscv/kernel/usercfi.c
@@ -69,6 +69,32 @@ void set_shstk_lock(struct task_struct *task)
 	task->thread_info.user_cfi_state.ubcfi_locked = 1;
 }
 
+bool is_indir_lp_enabled(struct task_struct *task)
+{
+	return task->thread_info.user_cfi_state.ufcfi_en ? true : false;
+}
+
+bool is_indir_lp_locked(struct task_struct *task)
+{
+	return task->thread_info.user_cfi_state.ufcfi_locked ? true : false;
+}
+
+void set_indir_lp_status(struct task_struct *task, bool enable)
+{
+	task->thread_info.user_cfi_state.ufcfi_en = enable ? 1 : 0;
+
+	if (enable)
+		task->thread.envcfg |= ENVCFG_LPE;
+	else
+		task->thread.envcfg &= ~ENVCFG_LPE;
+
+	csr_write(CSR_ENVCFG, task->thread.envcfg);
+}
+
+void set_indir_lp_lock(struct task_struct *task)
+{
+	task->thread_info.user_cfi_state.ufcfi_locked = 1;
+}
 /*
  * If size is 0, then to be compatible with regular stack we want it to be as big as
  * regular stack. Else PAGE_ALIGN it and return back
@@ -371,3 +397,53 @@ int arch_lock_shadow_stack_status(struct task_struct *task,
 
 	return 0;
 }
+
+int arch_get_indir_br_lp_status(struct task_struct *t, unsigned long __user *status)
+{
+	unsigned long fcfi_status = 0;
+
+	if (!cpu_supports_indirect_br_lp_instr())
+		return -EINVAL;
+
+	/* indirect branch tracking is enabled on the task or not */
+	fcfi_status |= (is_indir_lp_enabled(t) ? PR_INDIR_BR_LP_ENABLE : 0);
+
+	return copy_to_user(status, &fcfi_status, sizeof(fcfi_status)) ? -EFAULT : 0;
+}
+
+int arch_set_indir_br_lp_status(struct task_struct *t, unsigned long status)
+{
+	bool enable_indir_lp = false;
+
+	if (!cpu_supports_indirect_br_lp_instr())
+		return -EINVAL;
+
+	/* indirect branch tracking is locked and further can't be modified by user */
+	if (is_indir_lp_locked(t))
+		return -EINVAL;
+
+	/* Reject unknown flags */
+	if (status & ~PR_INDIR_BR_LP_ENABLE)
+		return -EINVAL;
+
+	enable_indir_lp = (status & PR_INDIR_BR_LP_ENABLE) ? true : false;
+	set_indir_lp_status(t, enable_indir_lp);
+
+	return 0;
+}
+
+int arch_lock_indir_br_lp_status(struct task_struct *task,
+				unsigned long arg)
+{
+	/*
+	 * If indirect branch tracking is not supported or not enabled on task,
+	 * nothing to lock here
+	 */
+	if (!cpu_supports_indirect_br_lp_instr() ||
+		!is_indir_lp_enabled(task))
+		return -EINVAL;
+
+	set_indir_lp_lock(task);
+
+	return 0;
+}

-- 
2.45.0




* [PATCH v6 21/33] riscv/traps: Introduce software check exception
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (19 preceding siblings ...)
  2024-10-08 22:37 ` [PATCH v6 20/33] riscv: Implements arch agnostic indirect branch tracking prctls Deepak Gupta
@ 2024-10-08 22:37 ` Deepak Gupta
  2024-10-08 22:37 ` [PATCH v6 22/33] riscv: signal: abstract header saving for setup_sigcontext Deepak Gupta
                   ` (12 subsequent siblings)
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:37 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta

zicfiss / zicfilp introduce a new exception to the priv isa, `software
check exception`, with cause code = 18. This patch implements the software
check exception.

Additionally it implements a cfi violation handler which checks for code
in xtval. If xtval=2, it means that sw check exception happened because of
an indirect branch not landing on 4 byte aligned PC or not landing on
`lpad` instruction or label value embedded in `lpad` not matching label
value setup in `x7`. If xtval=3, it means that sw check exception happened
because of mismatch between link register (x1 or x5) and top of shadow
stack (on execution of `sspopchk`).

In case of cfi violation, SIGSEGV is raised with code=SEGV_CPERR.
SEGV_CPERR was introduced by x86 shadow stack patches.
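
The xtval decoding and the resulting signal can be sketched as below.
This is an illustration only: enum cfi_violation, classify_sw_check()
and is_cfi_fault() are hypothetical names. The xtval codes 2 and 3 come
from the description above, and the fallback value 10 for SEGV_CPERR is
assumed to match the kernel's asm-generic siginfo definition for libc
headers that do not expose it yet.

```c
#include <signal.h>
#include <stdbool.h>

#ifndef SEGV_CPERR
#define SEGV_CPERR 10	/* assumed uapi value; newer headers define it */
#endif

enum cfi_violation { CFI_NONE, CFI_FORWARD, CFI_BACKWARD };

/* Map the xtval code of a software check exception to a violation kind. */
static enum cfi_violation classify_sw_check(unsigned long tval)
{
	switch (tval) {
	case 2:		/* landing pad (zicfilp) violation */
		return CFI_FORWARD;
	case 3:		/* shadow stack (zicfiss) violation */
		return CFI_BACKWARD;
	default:	/* not a cfi violation */
		return CFI_NONE;
	}
}

/* What a userspace SIGSEGV handler could check on its siginfo fields. */
static bool is_cfi_fault(int signo, int si_code)
{
	return signo == SIGSEGV && si_code == SEGV_CPERR;
}
```

A handler installed with SA_SIGINFO could pass info->si_signo and
info->si_code to is_cfi_fault() to tell a control flow violation apart
from an ordinary bad memory access.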

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
 arch/riscv/include/asm/asm-prototypes.h |  1 +
 arch/riscv/include/asm/entry-common.h   |  2 ++
 arch/riscv/kernel/entry.S               |  3 +++
 arch/riscv/kernel/traps.c               | 42 +++++++++++++++++++++++++++++++++
 4 files changed, 48 insertions(+)

diff --git a/arch/riscv/include/asm/asm-prototypes.h b/arch/riscv/include/asm/asm-prototypes.h
index cd627ec289f1..5a27cefd7805 100644
--- a/arch/riscv/include/asm/asm-prototypes.h
+++ b/arch/riscv/include/asm/asm-prototypes.h
@@ -51,6 +51,7 @@ DECLARE_DO_ERROR_INFO(do_trap_ecall_u);
 DECLARE_DO_ERROR_INFO(do_trap_ecall_s);
 DECLARE_DO_ERROR_INFO(do_trap_ecall_m);
 DECLARE_DO_ERROR_INFO(do_trap_break);
+DECLARE_DO_ERROR_INFO(do_trap_software_check);
 
 asmlinkage void handle_bad_stack(struct pt_regs *regs);
 asmlinkage void do_page_fault(struct pt_regs *regs);
diff --git a/arch/riscv/include/asm/entry-common.h b/arch/riscv/include/asm/entry-common.h
index 2293e535f865..4068c7e5452a 100644
--- a/arch/riscv/include/asm/entry-common.h
+++ b/arch/riscv/include/asm/entry-common.h
@@ -39,4 +39,6 @@ static inline int handle_misaligned_store(struct pt_regs *regs)
 }
 #endif
 
+bool handle_user_cfi_violation(struct pt_regs *regs);
+
 #endif /* _ASM_RISCV_ENTRY_COMMON_H */
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index a1f258fd7bbc..aaef4604d841 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -471,6 +471,9 @@ SYM_DATA_START_LOCAL(excp_vect_table)
 	RISCV_PTR do_page_fault   /* load page fault */
 	RISCV_PTR do_trap_unknown
 	RISCV_PTR do_page_fault   /* store page fault */
+	RISCV_PTR do_trap_unknown /* cause=16 */
+	RISCV_PTR do_trap_unknown /* cause=17 */
+	RISCV_PTR do_trap_software_check /* cause=18 is sw check exception */
 SYM_DATA_END_LABEL(excp_vect_table, SYM_L_LOCAL, excp_vect_table_end)
 
 #ifndef CONFIG_MMU
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index 51ebfd23e007..225b1d198ab6 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -354,6 +354,48 @@ void do_trap_ecall_u(struct pt_regs *regs)
 
 }
 
+#define CFI_TVAL_FCFI_CODE	2
+#define CFI_TVAL_BCFI_CODE	3
+/* handle cfi violations */
+bool handle_user_cfi_violation(struct pt_regs *regs)
+{
+	bool ret = false;
+	unsigned long tval = csr_read(CSR_TVAL);
+
+	if (((tval == CFI_TVAL_FCFI_CODE) && cpu_supports_indirect_br_lp_instr()) ||
+		((tval == CFI_TVAL_BCFI_CODE) && cpu_supports_shadow_stack())) {
+		do_trap_error(regs, SIGSEGV, SEGV_CPERR, regs->epc,
+					  "Oops - control flow violation");
+		ret = true;
+	}
+
+	return ret;
+}
+/*
+ * software check exception is defined with risc-v cfi spec. Software check
+ * exception is raised when:-
+ * a) An indirect branch doesn't land on 4 byte aligned PC or `lpad`
+ *    instruction or `label` value programmed in `lpad` instr doesn't
+ *    match with value setup in `x7`. reported code in `xtval` is 2.
+ * b) `sspopchk` instruction finds a mismatch between top of shadow stack (ssp)
+ *    and x1/x5. reported code in `xtval` is 3.
+ */
+asmlinkage __visible __trap_section void do_trap_software_check(struct pt_regs *regs)
+{
+	if (user_mode(regs)) {
+		irqentry_enter_from_user_mode(regs);
+
+		/* not a cfi violation, then merge into flow of unknown trap handler */
+		if (!handle_user_cfi_violation(regs))
+			do_trap_unknown(regs);
+
+		irqentry_exit_to_user_mode(regs);
+	} else {
+		/* sw check exception coming from kernel is a bug in kernel */
+		die(regs, "Kernel BUG");
+	}
+}
+
 #ifdef CONFIG_MMU
 asmlinkage __visible noinstr void do_page_fault(struct pt_regs *regs)
 {

-- 
2.45.0




* [PATCH v6 22/33] riscv: signal: abstract header saving for setup_sigcontext
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (20 preceding siblings ...)
  2024-10-08 22:37 ` [PATCH v6 21/33] riscv/traps: Introduce software check exception Deepak Gupta
@ 2024-10-08 22:37 ` Deepak Gupta
  2024-10-08 22:37 ` [PATCH v6 23/33] riscv/signal: save and restore of shadow stack for signal Deepak Gupta
                   ` (11 subsequent siblings)
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:37 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta, Andy Chiu

From: Andy Chiu <andy.chiu@sifive.com>

The function save_v_state() served two purposes. First, it saved
extension context into the signal stack. Then, it constructed the
extension header if there was no fault. The second part is independent
of the extension itself. As a result, we can pull that part out, so
future extensions may reuse it. This patch adds arch_ext_list and makes
setup_sigcontext() go through all possible extensions' save() callback.
The callback returns a positive value indicating the size of the
successfully saved extension. Then the kernel proceeds to construct the
header for that extension. The kernel skips an extension if it does
not exist, or if saving fails for some reason. The error code is
propagated out in the latter case.

This patch does not introduce any functional changes.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
---
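For readers following the refactor, the save()-callback loop can be
modelled in plain userspace C. This is only a sketch under stated
assumptions: struct ctx_hdr, struct ext_ops, the DEMO_* constants and
save_all_exts() are hypothetical stand-ins, not kernel definitions, and a
flat byte buffer replaces the user signal frame.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace stand-ins; not the kernel's definitions. */
struct ctx_hdr {
	uint32_t magic;
	uint32_t size;
};

struct ext_ops {
	uint32_t magic;
	/* Returns bytes used (header included), 0 to skip, or <0 on error. */
	long (*save)(void *body);
};

#define DEMO_MAGIC_A		0x41414141u
#define DEMO_MAGIC_B		0x42424242u
#define DEMO_MAX_EXT_SIZE	64	/* assumed upper bound per extension */

/* An extension that is present: writes an 8-byte body. */
static long save_a(void *body)
{
	uint64_t v = 0x1122334455667788ull;

	memcpy(body, &v, sizeof(v));
	return (long)(sizeof(struct ctx_hdr) + sizeof(v));
}

/* An extension that is absent on this "machine": nothing is written. */
static long save_b(void *body)
{
	(void)body;
	return 0;
}

static const struct ext_ops ext_list[] = {
	{ .magic = DEMO_MAGIC_A, .save = save_a },
	{ .magic = DEMO_MAGIC_B, .save = save_b },
};

/*
 * Walk the list. For each extension that saved something, fill in the
 * header in front of its body and advance. Returns the number of bytes
 * written, or a negative value on error.
 */
static long save_all_exts(unsigned char *buf, size_t len)
{
	size_t off = 0;

	for (size_t i = 0; i < sizeof(ext_list) / sizeof(ext_list[0]); i++) {
		const struct ext_ops *ext = &ext_list[i];
		struct ctx_hdr hdr;
		long ext_size;

		if (!ext->save)
			continue;
		if (len - off < DEMO_MAX_EXT_SIZE)
			return -1;

		/* The body goes right after the header slot. */
		ext_size = ext->save(buf + off + sizeof(hdr));
		if (ext_size < 0)
			return ext_size;
		if (ext_size == 0)
			continue;

		hdr.magic = ext->magic;
		hdr.size = (uint32_t)ext_size;
		memcpy(buf + off, &hdr, sizeof(hdr));
		off += (size_t)ext_size;
	}

	return (long)off;
}
```

Calling save_all_exts() on a zeroed 128-byte buffer writes one header plus
body for extension A and skips extension B, which mirrors how an absent
extension leaves no trace in the frame.
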
 arch/riscv/include/asm/vector.h |  3 +++
 arch/riscv/kernel/signal.c      | 60 ++++++++++++++++++++++++++---------------
 2 files changed, 42 insertions(+), 21 deletions(-)

diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h
index be7d309cca8a..2d2ec6ca3abb 100644
--- a/arch/riscv/include/asm/vector.h
+++ b/arch/riscv/include/asm/vector.h
@@ -281,6 +281,9 @@ static inline bool riscv_v_vstate_ctrl_user_allowed(void) { return false; }
 #define riscv_v_thread_free(tsk)		do {} while (0)
 #define  riscv_v_setup_ctx_cache()		do {} while (0)
 #define riscv_v_thread_alloc(tsk)		do {} while (0)
+#define get_cpu_vector_context()		do {} while (0)
+#define put_cpu_vector_context()		do {} while (0)
+#define riscv_v_vstate_set_restore(task, regs)	do {} while (0)
 
 #endif /* CONFIG_RISCV_ISA_V */
 
diff --git a/arch/riscv/kernel/signal.c b/arch/riscv/kernel/signal.c
index dcd282419456..014ac1024b85 100644
--- a/arch/riscv/kernel/signal.c
+++ b/arch/riscv/kernel/signal.c
@@ -68,18 +68,18 @@ static long save_fp_state(struct pt_regs *regs,
 #define restore_fp_state(task, regs) (0)
 #endif
 
-#ifdef CONFIG_RISCV_ISA_V
-
-static long save_v_state(struct pt_regs *regs, void __user **sc_vec)
+static long save_v_state(struct pt_regs *regs, void __user *sc_vec)
 {
-	struct __riscv_ctx_hdr __user *hdr;
 	struct __sc_riscv_v_state __user *state;
 	void __user *datap;
 	long err;
 
-	hdr = *sc_vec;
-	/* Place state to the user's signal context space after the hdr */
-	state = (struct __sc_riscv_v_state __user *)(hdr + 1);
+	if (!IS_ENABLED(CONFIG_RISCV_ISA_V) ||
+		!(has_vector() && riscv_v_vstate_query(regs)))
+		return 0;
+
+	/* Place state in the user's signal context space */
+	state = (struct __sc_riscv_v_state __user *)sc_vec;
 	/* Point datap right after the end of __sc_riscv_v_state */
 	datap = state + 1;
 
@@ -97,15 +97,11 @@ static long save_v_state(struct pt_regs *regs, void __user **sc_vec)
 	err |= __put_user((__force void *)datap, &state->v_state.datap);
 	/* Copy the whole vector content to user space datap. */
 	err |= __copy_to_user(datap, current->thread.vstate.datap, riscv_v_vsize);
-	/* Copy magic to the user space after saving  all vector conetext */
-	err |= __put_user(RISCV_V_MAGIC, &hdr->magic);
-	err |= __put_user(riscv_v_sc_size, &hdr->size);
 	if (unlikely(err))
-		return err;
+		return -EFAULT;
 
-	/* Only progress the sv_vec if everything has done successfully  */
-	*sc_vec += riscv_v_sc_size;
-	return 0;
+	/* Only return the size if everything completed successfully */
+	return riscv_v_sc_size;
 }
 
 /*
@@ -142,10 +138,19 @@ static long __restore_v_state(struct pt_regs *regs, void __user *sc_vec)
 	 */
 	return copy_from_user(current->thread.vstate.datap, datap, riscv_v_vsize);
 }
-#else
-#define save_v_state(task, regs) (0)
-#define __restore_v_state(task, regs) (0)
-#endif
+
+struct arch_ext_priv {
+	__u32 magic;
+	long (*save)(struct pt_regs *regs, void __user *sc_vec);
+};
+
+struct arch_ext_priv arch_ext_list[] = {
+	{
+		.magic = RISCV_V_MAGIC,
+		.save = &save_v_state,
+	},
+};
+const size_t nr_arch_exts = ARRAY_SIZE(arch_ext_list);
 
 static long restore_sigcontext(struct pt_regs *regs,
 	struct sigcontext __user *sc)
@@ -276,7 +281,8 @@ static long setup_sigcontext(struct rt_sigframe __user *frame,
 {
 	struct sigcontext __user *sc = &frame->uc.uc_mcontext;
 	struct __riscv_ctx_hdr __user *sc_ext_ptr = &sc->sc_extdesc.hdr;
-	long err;
+	struct arch_ext_priv *arch_ext;
+	long err, i, ext_size;
 
 	/* sc_regs is structured the same as the start of pt_regs */
 	err = __copy_to_user(&sc->sc_regs, regs, sizeof(sc->sc_regs));
@@ -284,8 +290,20 @@ static long setup_sigcontext(struct rt_sigframe __user *frame,
 	if (has_fpu())
 		err |= save_fp_state(regs, &sc->sc_fpregs);
 	/* Save the vector state. */
-	if (has_vector() && riscv_v_vstate_query(regs))
-		err |= save_v_state(regs, (void __user **)&sc_ext_ptr);
+	for (i = 0; i < nr_arch_exts; i++) {
+		arch_ext = &arch_ext_list[i];
+		if (!arch_ext->save)
+			continue;
+
+		ext_size = arch_ext->save(regs, sc_ext_ptr + 1);
+		if (ext_size <= 0) {
+			err |= ext_size;
+		} else {
+			err |= __put_user(arch_ext->magic, &sc_ext_ptr->magic);
+			err |= __put_user(ext_size, &sc_ext_ptr->size);
+			sc_ext_ptr = (void *)sc_ext_ptr + ext_size;
+		}
+	}
 	/* Write zero to fp-reserved space and check it on restore_sigcontext */
 	err |= __put_user(0, &sc->sc_extdesc.reserved);
 	/* And put END __riscv_ctx_hdr at the end. */

-- 
2.45.0



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v6 23/33] riscv/signal: save and restore of shadow stack for signal
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (21 preceding siblings ...)
  2024-10-08 22:37 ` [PATCH v6 22/33] riscv: signal: abstract header saving for setup_sigcontext Deepak Gupta
@ 2024-10-08 22:37 ` Deepak Gupta
  2024-10-08 22:37 ` [PATCH v6 24/33] riscv/kernel: update __show_regs to print shadow stack register Deepak Gupta
                   ` (10 subsequent siblings)
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:37 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta, Andy Chiu

Save the shadow stack pointer in the sigcontext structure while delivering a
signal. Restore the shadow stack pointer from sigcontext on sigreturn.

As part of the save operation, the kernel uses `ssamoswap` to save a snapshot
of the current shadow stack pointer on the shadow stack itself (a save
token). During restore on sigreturn, the kernel retrieves the token from the
top of the shadow stack and validates it. This ensures that user mode cannot
arbitrarily pivot to any shadow stack address without having a token, and
thus provides strong security assurance in the window between signal
delivery and sigreturn.

Use an ABI-compatible way of saving/restoring the shadow stack pointer on the
signal stack. This follows what the Vector extension does, where extra
registers are placed on the stack in the form of an extension header plus an
extension body. The extension header indicates the size of the extra
architectural state plus the size of the header itself, and a magic
identifier of the extension. The extension body then contains the new
architectural state in the form defined by uapi.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
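The token scheme described above can be modelled in userspace. This is a
rough sketch, not the kernel code: struct demo_shstk, demo_save() and
demo_restore() are hypothetical names, an ordinary array stands in for
shadow stack memory, and a plain store/load replaces `ssamoswap`. It
assumes the token is written one entry below the current pointer and holds
the old pointer value, which is what the check in restore_user_shstk()
implies.

```c
#include <stddef.h>
#include <stdint.h>

#define ENTRY_SIZE	sizeof(uintptr_t)
#define SHSTK_ENTRIES	16

/* Hypothetical model: a plain array stands in for shadow stack memory. */
struct demo_shstk {
	uintptr_t mem[SHSTK_ENTRIES];
	uintptr_t ssp;	/* address of the current top; stack grows down */
};

/* Translate an address into a slot, or NULL if outside or misaligned. */
static uintptr_t *slot(struct demo_shstk *s, uintptr_t addr)
{
	uintptr_t base = (uintptr_t)&s->mem[0];
	uintptr_t end = (uintptr_t)&s->mem[SHSTK_ENTRIES];

	if (addr < base || addr >= end || (addr - base) % ENTRY_SIZE)
		return NULL;
	return (uintptr_t *)addr;
}

/* Write a token holding the old ssp one entry below it. */
static int demo_save(struct demo_shstk *s, uintptr_t *saved)
{
	uintptr_t tok_addr = s->ssp - ENTRY_SIZE;
	uintptr_t *p = slot(s, tok_addr);

	if (!p)
		return -1;
	*p = s->ssp;
	s->ssp = tok_addr;
	*saved = tok_addr;	/* this value goes into the signal frame */
	return 0;
}

/* Consume the token at ptr; accept it only if it points one entry up. */
static int demo_restore(struct demo_shstk *s, uintptr_t ptr)
{
	uintptr_t *p = slot(s, ptr);
	uintptr_t token;

	if (!p)
		return -1;
	token = *p;
	*p = 0;		/* swap in zero so the token cannot be replayed */
	if (token - ptr != ENTRY_SIZE)
		return -2;
	s->ssp = token;
	return 0;
}
```

In this model a restore succeeds only at the address returned by the
matching save. A second restore at the same address fails because the
token was cleared, and a restore at any other slot fails the distance
check.
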
 arch/riscv/include/asm/usercfi.h         | 10 ++++
 arch/riscv/include/uapi/asm/ptrace.h     |  4 ++
 arch/riscv/include/uapi/asm/sigcontext.h |  1 +
 arch/riscv/kernel/signal.c               | 80 ++++++++++++++++++++++++++++++++
 arch/riscv/kernel/usercfi.c              | 57 +++++++++++++++++++++++
 5 files changed, 152 insertions(+)

diff --git a/arch/riscv/include/asm/usercfi.h b/arch/riscv/include/asm/usercfi.h
index 19ee8e7e23ee..fe58b13b5fa6 100644
--- a/arch/riscv/include/asm/usercfi.h
+++ b/arch/riscv/include/asm/usercfi.h
@@ -8,6 +8,7 @@
 #ifndef __ASSEMBLY__
 #include <linux/types.h>
 #include <linux/prctl.h>
+#include <linux/errno.h>
 
 struct task_struct;
 struct kernel_clone_args;
@@ -35,6 +36,9 @@ bool is_shstk_locked(struct task_struct *task);
 bool is_shstk_allocated(struct task_struct *task);
 void set_shstk_lock(struct task_struct *task);
 void set_shstk_status(struct task_struct *task, bool enable);
+unsigned long get_active_shstk(struct task_struct *task);
+int restore_user_shstk(struct task_struct *tsk, unsigned long shstk_ptr);
+int save_user_shstk(struct task_struct *tsk, unsigned long *saved_shstk_ptr);
 bool is_indir_lp_enabled(struct task_struct *task);
 bool is_indir_lp_locked(struct task_struct *task);
 void set_indir_lp_status(struct task_struct *task, bool enable);
@@ -72,6 +76,12 @@ void set_indir_lp_lock(struct task_struct *task);
 
 #define set_indir_lp_lock(task)
 
+#define restore_user_shstk(tsk, shstk_ptr) -EINVAL
+
+#define save_user_shstk(tsk, saved_shstk_ptr) -EINVAL
+
+#define get_active_shstk(task) 0UL
+
 #endif /* CONFIG_RISCV_USER_CFI */
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/riscv/include/uapi/asm/ptrace.h b/arch/riscv/include/uapi/asm/ptrace.h
index a38268b19c3d..659ea3af5680 100644
--- a/arch/riscv/include/uapi/asm/ptrace.h
+++ b/arch/riscv/include/uapi/asm/ptrace.h
@@ -127,6 +127,10 @@ struct __riscv_v_regset_state {
  */
 #define RISCV_MAX_VLENB (8192)
 
+struct __sc_riscv_cfi_state {
+	unsigned long ss_ptr;   /* shadow stack pointer */
+};
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _UAPI_ASM_RISCV_PTRACE_H */
diff --git a/arch/riscv/include/uapi/asm/sigcontext.h b/arch/riscv/include/uapi/asm/sigcontext.h
index cd4f175dc837..f37e4beffe03 100644
--- a/arch/riscv/include/uapi/asm/sigcontext.h
+++ b/arch/riscv/include/uapi/asm/sigcontext.h
@@ -10,6 +10,7 @@
 
 /* The Magic number for signal context frame header. */
 #define RISCV_V_MAGIC	0x53465457
+#define RISCV_ZICFISS_MAGIC		0x9487
 #define END_MAGIC	0x0
 
 /* The size of END signal context header. */
diff --git a/arch/riscv/kernel/signal.c b/arch/riscv/kernel/signal.c
index 014ac1024b85..77cbc4a01e49 100644
--- a/arch/riscv/kernel/signal.c
+++ b/arch/riscv/kernel/signal.c
@@ -22,11 +22,13 @@
 #include <asm/vector.h>
 #include <asm/csr.h>
 #include <asm/cacheflush.h>
+#include <asm/usercfi.h>
 
 unsigned long signal_minsigstksz __ro_after_init;
 
 extern u32 __user_rt_sigreturn[2];
 static size_t riscv_v_sc_size __ro_after_init;
+static size_t riscv_zicfiss_sc_size __ro_after_init;
 
 #define DEBUG_SIG 0
 
@@ -139,6 +141,62 @@ static long __restore_v_state(struct pt_regs *regs, void __user *sc_vec)
 	return copy_from_user(current->thread.vstate.datap, datap, riscv_v_vsize);
 }
 
+static long save_cfiss_state(struct pt_regs *regs, void __user *sc_cfi)
+{
+	struct __sc_riscv_cfi_state __user *state = sc_cfi;
+	unsigned long ss_ptr = 0;
+	long err = 0;
+
+	if (!IS_ENABLED(CONFIG_RISCV_USER_CFI) || !is_shstk_enabled(current))
+		return 0;
+
+	/*
+	 * Save a pointer to shadow stack itself on shadow stack as a form of token.
+	 * A token on shadow stack gives the following properties:
+	 * - Safe save and restore for shadow stack switching. Any save of shadow stack
+	 *   must have saved a token on shadow stack. Similarly, any restore of shadow
+	 *   stack must check the token before restoring. Since writing the address of
+	 *   the shadow stack to the shadow stack itself is not easily allowed, a restore
+	 *   without a save is quite difficult for an attacker to perform.
+	 * - A natural break. A token in shadow stack provides a natural break in shadow stack,
+	 *   so a single linear range can be bucketed into different shadow stack segments. Any
+	 *   sspopchk will detect the condition and fault to kernel as a sw check exception.
+	 */
+	err |= save_user_shstk(current, &ss_ptr);
+	err |= __put_user(ss_ptr, &state->ss_ptr);
+	if (unlikely(err))
+		return -EFAULT;
+
+	return riscv_zicfiss_sc_size;
+}
+
+static long __restore_cfiss_state(struct pt_regs *regs, void __user *sc_cfi)
+{
+	struct __sc_riscv_cfi_state __user *state = sc_cfi;
+	unsigned long ss_ptr = 0;
+	long err;
+
+	/*
+	 * Restore the shadow stack pointer by validating the token stored on the
+	 * shadow stack itself, which makes the restore safe.
+	 * A token on shadow stack gives the following properties:
+	 * - Safe save and restore for shadow stack switching. Any save of shadow stack
+	 *   must have saved a token on shadow stack. Similarly, any restore of shadow
+	 *   stack must check the token before restoring. Since writing the address of
+	 *   the shadow stack to the shadow stack itself is not easily allowed, a restore
+	 *   without a save is quite difficult for an attacker to perform.
+	 * - A natural break. A token in shadow stack provides a natural break in shadow stack,
+	 *   so a single linear range can be bucketed into different shadow stack segments.
+	 *   sspopchk will detect the condition and fault to kernel as a sw check exception.
+	 */
+	err = __copy_from_user(&ss_ptr, &state->ss_ptr, sizeof(unsigned long));
+
+	if (unlikely(err))
+		return err;
+
+	return restore_user_shstk(current, ss_ptr);
+}
+
 struct arch_ext_priv {
 	__u32 magic;
 	long (*save)(struct pt_regs *regs, void __user *sc_vec);
@@ -149,6 +207,10 @@ struct arch_ext_priv arch_ext_list[] = {
 		.magic = RISCV_V_MAGIC,
 		.save = &save_v_state,
 	},
+	{
+		.magic = RISCV_ZICFISS_MAGIC,
+		.save = &save_cfiss_state,
+	},
 };
 const size_t nr_arch_exts = ARRAY_SIZE(arch_ext_list);
 
@@ -200,6 +262,12 @@ static long restore_sigcontext(struct pt_regs *regs,
 
 			err = __restore_v_state(regs, sc_ext_ptr);
 			break;
+		case RISCV_ZICFISS_MAGIC:
+			if (!is_shstk_enabled(current) || size != riscv_zicfiss_sc_size)
+				return -EINVAL;
+
+			err = __restore_cfiss_state(regs, sc_ext_ptr);
+			break;
 		default:
 			return -EINVAL;
 		}
@@ -220,6 +288,10 @@ static size_t get_rt_frame_size(bool cal_all)
 		if (cal_all || riscv_v_vstate_query(task_pt_regs(current)))
 			total_context_size += riscv_v_sc_size;
 	}
+
+	if (is_shstk_enabled(current))
+		total_context_size += riscv_zicfiss_sc_size;
+
 	/*
 	 * Preserved a __riscv_ctx_hdr for END signal context header if an
 	 * extension uses __riscv_extra_ext_header
@@ -363,6 +435,11 @@ static int setup_rt_frame(struct ksignal *ksig, sigset_t *set,
 #ifdef CONFIG_MMU
 	regs->ra = (unsigned long)VDSO_SYMBOL(
 		current->mm->context.vdso, rt_sigreturn);
+
+	/* if bcfi is enabled x1 (ra) and x5 (t0) must match. not sure if we need this? */
+	if (is_shstk_enabled(current))
+		regs->t0 = regs->ra;
+
 #else
 	/*
 	 * For the nommu case we don't have a VDSO.  Instead we push two
@@ -491,6 +568,9 @@ void __init init_rt_signal_env(void)
 {
 	riscv_v_sc_size = sizeof(struct __riscv_ctx_hdr) +
 			  sizeof(struct __sc_riscv_v_state) + riscv_v_vsize;
+
+	riscv_zicfiss_sc_size = sizeof(struct __riscv_ctx_hdr) +
+			  sizeof(struct __sc_riscv_cfi_state);
 	/*
 	 * Determine the stack space required for guaranteed signal delivery.
 	 * The signal_minsigstksz will be populated into the AT_MINSIGSTKSZ entry
diff --git a/arch/riscv/kernel/usercfi.c b/arch/riscv/kernel/usercfi.c
index 21ea2237efcf..92d03eb76c03 100644
--- a/arch/riscv/kernel/usercfi.c
+++ b/arch/riscv/kernel/usercfi.c
@@ -52,6 +52,11 @@ void set_active_shstk(struct task_struct *task, unsigned long shstk_addr)
 	task->thread_info.user_cfi_state.user_shdw_stk = shstk_addr;
 }
 
+unsigned long get_active_shstk(struct task_struct *task)
+{
+	return task->thread_info.user_cfi_state.user_shdw_stk;
+}
+
 void set_shstk_status(struct task_struct *task, bool enable)
 {
 	task->thread_info.user_cfi_state.ubcfi_en = enable ? 1 : 0;
@@ -164,6 +169,58 @@ static int create_rstor_token(unsigned long ssp, unsigned long *token_addr)
 	return 0;
 }
 
+/*
+ * Save user shadow stack pointer on shadow stack itself and return pointer to saved location
+ * returns -EFAULT if operation was unsuccessful
+ */
+int save_user_shstk(struct task_struct *tsk, unsigned long *saved_shstk_ptr)
+{
+	unsigned long ss_ptr = 0;
+	unsigned long token_loc = 0;
+	int ret = 0;
+
+	if (saved_shstk_ptr == NULL)
+		return -EINVAL;
+
+	ss_ptr = get_active_shstk(tsk);
+	ret = create_rstor_token(ss_ptr, &token_loc);
+
+	if (!ret) {
+		*saved_shstk_ptr = token_loc;
+		set_active_shstk(tsk, token_loc);
+	}
+
+	return ret;
+}
+
+/*
+ * Restores user shadow stack pointer from token on shadow stack for task `tsk`
+ * returns -EFAULT if operation was unsuccessful
+ */
+int restore_user_shstk(struct task_struct *tsk, unsigned long shstk_ptr)
+{
+	unsigned long token = 0;
+
+	token = amo_user_shstk((unsigned long __user *)shstk_ptr, 0);
+
+	if (token == -1)
+		return -EFAULT;
+
+	/* invalid token, return EINVAL */
+	if ((token - shstk_ptr) != SHSTK_ENTRY_SIZE) {
+		pr_info_ratelimited(
+				"%s[%d]: bad restore token in %s: pc=%p sp=%p, token=%p, shstk_ptr=%p\n",
+				tsk->comm, task_pid_nr(tsk), __func__,
+				(void *)(task_pt_regs(tsk)->epc), (void *)(task_pt_regs(tsk)->sp),
+				(void *)token, (void *)shstk_ptr);
+		return -EINVAL;
+	}
+
+	/* all checks passed, set active shstk and return success */
+	set_active_shstk(tsk, token);
+	return 0;
+}
+
 static unsigned long allocate_shadow_stack(unsigned long addr, unsigned long size,
 				unsigned long token_offset,
 				bool set_tok)

-- 
2.45.0



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v6 24/33] riscv/kernel: update __show_regs to print shadow stack register
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (22 preceding siblings ...)
  2024-10-08 22:37 ` [PATCH v6 23/33] riscv/signal: save and restore of shadow stack for signal Deepak Gupta
@ 2024-10-08 22:37 ` Deepak Gupta
  2024-10-08 22:37 ` [PATCH v6 25/33] riscv/ptrace: riscv cfi status and state via ptrace and in core files Deepak Gupta
                   ` (9 subsequent siblings)
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:37 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta

Update __show_regs to also print the captured shadow stack pointer.
On tasks where shadow stack is disabled, it simply prints 0.

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
---
 arch/riscv/kernel/process.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index 5207f018415c..6db0fde3701e 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -89,8 +89,8 @@ void __show_regs(struct pt_regs *regs)
 		regs->s8, regs->s9, regs->s10);
 	pr_cont(" s11: " REG_FMT " t3 : " REG_FMT " t4 : " REG_FMT "\n",
 		regs->s11, regs->t3, regs->t4);
-	pr_cont(" t5 : " REG_FMT " t6 : " REG_FMT "\n",
-		regs->t5, regs->t6);
+	pr_cont(" t5 : " REG_FMT " t6 : " REG_FMT " ssp : " REG_FMT "\n",
+		regs->t5, regs->t6, get_active_shstk(current));
 
 	pr_cont("status: " REG_FMT " badaddr: " REG_FMT " cause: " REG_FMT "\n",
 		regs->status, regs->badaddr, regs->cause);

-- 
2.45.0



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v6 25/33] riscv/ptrace: riscv cfi status and state via ptrace and in core files
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (23 preceding siblings ...)
  2024-10-08 22:37 ` [PATCH v6 24/33] riscv/kernel: update __show_regs to print shadow stack register Deepak Gupta
@ 2024-10-08 22:37 ` Deepak Gupta
  2024-10-08 22:37 ` [PATCH v6 26/33] riscv/hwprobe: zicfilp / zicfiss enumeration in hwprobe Deepak Gupta
                   ` (8 subsequent siblings)
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:37 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta

Expose a new register type NT_RISCV_USER_CFI for risc-v cfi status and
state. Both landing pad and shadow stack status and state are intentionally
rolled into a single cfi state. Creating two different NT_RISCV_USER_XXX
note types would not be useful and would waste a note type. Enabling or
disabling the feature is not allowed via the ptrace set interface. However,
setting the `elp` state or the shadow stack pointer is allowed via the
ptrace set interface. It is expected that `gdb` might need this to fix up
the `elp` state or the shadow stack pointer.

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
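The write-side policy described above (only `elp` state and the shadow
stack pointer may change) can be sketched as a small userspace check. The
names below (struct demo_cfi_status, struct demo_cfi_state,
demo_cfi_set_allowed(), demo_elp_from_status(), DEMO_SR_ELP) are
hypothetical. The sketch assumes the reserved field covers the remaining
59 bits and must be zero, and it assumes the ELP bit sits at bit 23 of the
status word.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the regset payload; not the uapi header. */
struct demo_cfi_status {
	uint64_t lp_en : 1;
	uint64_t lp_lock : 1;
	uint64_t elp_state : 1;
	uint64_t shstk_en : 1;
	uint64_t shstk_lock : 1;
	uint64_t rsvd : 59;	/* assumed: all remaining bits of the word */
};

struct demo_cfi_state {
	struct demo_cfi_status cfi_status;
	uint64_t shstk_ptr;
};

#define DEMO_SR_ELP	(UINT64_C(1) << 23)	/* assumed ELP bit position */

/* A write request may carry only elp_state and shstk_ptr. */
static bool demo_cfi_set_allowed(const struct demo_cfi_state *req)
{
	const struct demo_cfi_status *st = &req->cfi_status;

	return !(st->lp_en || st->lp_lock || st->shstk_en ||
		 st->shstk_lock || st->rsvd);
}

/*
 * Collapse a masked status word to 0 or 1 before storing it in a 1-bit
 * field. Storing the masked value directly would keep only bit 0.
 */
static unsigned int demo_elp_from_status(uint64_t status)
{
	return !!(status & DEMO_SR_ELP);
}
```

A request that sets only elp_state and shstk_ptr passes the check, while
one that sets any enable, lock or reserved bit is rejected, which is the
behaviour the commit message describes for the ptrace set interface.
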
 arch/riscv/include/uapi/asm/ptrace.h | 18 ++++++++
 arch/riscv/kernel/ptrace.c           | 83 ++++++++++++++++++++++++++++++++++++
 include/uapi/linux/elf.h             |  1 +
 3 files changed, 102 insertions(+)

diff --git a/arch/riscv/include/uapi/asm/ptrace.h b/arch/riscv/include/uapi/asm/ptrace.h
index 659ea3af5680..e6571fba8a8a 100644
--- a/arch/riscv/include/uapi/asm/ptrace.h
+++ b/arch/riscv/include/uapi/asm/ptrace.h
@@ -131,6 +131,24 @@ struct __sc_riscv_cfi_state {
 	unsigned long ss_ptr;   /* shadow stack pointer */
 };
 
+struct __cfi_status {
+	/* indirect branch tracking state */
+	__u64 lp_en : 1;
+	__u64 lp_lock : 1;
+	__u64 elp_state : 1;
+
+	/* shadow stack status */
+	__u64 shstk_en : 1;
+	__u64 shstk_lock : 1;
+
+	__u64 rsvd : (sizeof(__u64) * 8) - 5;
+};
+
+struct user_cfi_state {
+	struct __cfi_status	cfi_status;
+	__u64 shstk_ptr;
+};
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _UAPI_ASM_RISCV_PTRACE_H */
diff --git a/arch/riscv/kernel/ptrace.c b/arch/riscv/kernel/ptrace.c
index 92731ff8c79a..c69b20ea6e79 100644
--- a/arch/riscv/kernel/ptrace.c
+++ b/arch/riscv/kernel/ptrace.c
@@ -19,6 +19,7 @@
 #include <linux/regset.h>
 #include <linux/sched.h>
 #include <linux/sched/task_stack.h>
+#include <asm/usercfi.h>
 
 enum riscv_regset {
 	REGSET_X,
@@ -28,6 +29,9 @@ enum riscv_regset {
 #ifdef CONFIG_RISCV_ISA_V
 	REGSET_V,
 #endif
+#ifdef CONFIG_RISCV_USER_CFI
+	REGSET_CFI,
+#endif
 };
 
 static int riscv_gpr_get(struct task_struct *target,
@@ -152,6 +156,75 @@ static int riscv_vr_set(struct task_struct *target,
 }
 #endif
 
+#ifdef CONFIG_RISCV_USER_CFI
+static int riscv_cfi_get(struct task_struct *target,
+			const struct user_regset *regset,
+			struct membuf to)
+{
+	struct user_cfi_state user_cfi;
+	struct pt_regs *regs;
+
+	regs = task_pt_regs(target);
+
+	user_cfi.cfi_status.lp_en = is_indir_lp_enabled(target);
+	user_cfi.cfi_status.lp_lock = is_indir_lp_locked(target);
+	user_cfi.cfi_status.elp_state = !!(regs->status & SR_ELP);
+
+	user_cfi.cfi_status.shstk_en = is_shstk_enabled(target);
+	user_cfi.cfi_status.shstk_lock = is_shstk_locked(target);
+	user_cfi.shstk_ptr = get_active_shstk(target);
+
+	return membuf_write(&to, &user_cfi, sizeof(user_cfi));
+}
+
+/*
+ * Does it make sense to allowing enable / disable of cfi via ptrace?
+ * Not allowing enable / disable / locking control via ptrace for now.
+ * Setting shadow stack pointer is allowed. GDB might use it to unwind or
+ * some other fixup. Similarly gdb might want to suppress elp and may want
+ * to reset elp state.
+ */
+static int riscv_cfi_set(struct task_struct *target,
+			const struct user_regset *regset,
+			unsigned int pos, unsigned int count,
+			const void *kbuf, const void __user *ubuf)
+{
+	int ret;
+	struct user_cfi_state user_cfi;
+	struct pt_regs *regs;
+
+	regs = task_pt_regs(target);
+
+	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &user_cfi, 0, -1);
+	if (ret)
+		return ret;
+
+	/*
+	 * Enabling or locking shadow stack or landing pad is not allowed via ptrace,
+	 * and neither is disabling them.
+	 * rsvd field must be zero so that those bits can be given a meaning in future.
+	 */
+	if (user_cfi.cfi_status.lp_en || user_cfi.cfi_status.lp_lock ||
+		user_cfi.cfi_status.shstk_en || user_cfi.cfi_status.shstk_lock ||
+		user_cfi.cfi_status.rsvd)
+		return -EINVAL;
+
+	/* If lpad is enabled on target and ptrace requests to set / clear elp, do that */
+	if (is_indir_lp_enabled(target)) {
+		if (user_cfi.cfi_status.elp_state) /* set elp state */
+			regs->status |= SR_ELP;
+		else
+			regs->status &= ~SR_ELP; /* clear elp state */
+	}
+
+	/* If shadow stack enabled on target, set new shadow stack pointer */
+	if (is_shstk_enabled(target))
+		set_active_shstk(target, user_cfi.shstk_ptr);
+
+	return 0;
+}
+#endif
+
 static const struct user_regset riscv_user_regset[] = {
 	[REGSET_X] = {
 		.core_note_type = NT_PRSTATUS,
@@ -182,6 +255,16 @@ static const struct user_regset riscv_user_regset[] = {
 		.set = riscv_vr_set,
 	},
 #endif
+#ifdef CONFIG_RISCV_USER_CFI
+	[REGSET_CFI] = {
+		.core_note_type = NT_RISCV_USER_CFI,
+		.align = sizeof(__u64),
+		.n = sizeof(struct user_cfi_state) / sizeof(__u64),
+		.size = sizeof(__u64),
+		.regset_get = riscv_cfi_get,
+		.set = riscv_cfi_set,
+	}
+#endif
 };
 
 static const struct user_regset_view riscv_user_native_view = {
diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
index b9935988da5c..7ef63b2b67a1 100644
--- a/include/uapi/linux/elf.h
+++ b/include/uapi/linux/elf.h
@@ -450,6 +450,7 @@ typedef struct elf64_shdr {
 #define NT_MIPS_MSA	0x802		/* MIPS SIMD registers */
 #define NT_RISCV_CSR	0x900		/* RISC-V Control and Status Registers */
 #define NT_RISCV_VECTOR	0x901		/* RISC-V vector registers */
+#define NT_RISCV_USER_CFI	0x902		/* RISC-V shadow stack state */
 #define NT_LOONGARCH_CPUCFG	0xa00	/* LoongArch CPU config registers */
 #define NT_LOONGARCH_CSR	0xa01	/* LoongArch control and status registers */
 #define NT_LOONGARCH_LSX	0xa02	/* LoongArch Loongson SIMD Extension registers */

-- 
2.45.0



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v6 26/33] riscv/hwprobe: zicfilp / zicfiss enumeration in hwprobe
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (24 preceding siblings ...)
  2024-10-08 22:37 ` [PATCH v6 25/33] riscv/ptrace: riscv cfi status and state via ptrace and in core files Deepak Gupta
@ 2024-10-08 22:37 ` Deepak Gupta
  2024-10-08 22:37 ` [PATCH v6 27/33] riscv: Add Firmware Feature SBI extensions definitions Deepak Gupta
                   ` (7 subsequent siblings)
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:37 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta

Add enumeration of the zicfilp and zicfiss extensions to the hwprobe syscall.

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
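Userspace would typically test these bits in the value returned for
RISCV_HWPROBE_KEY_IMA_EXT_0. The sketch below shows only the bit tests.
The helper names has_landing_pad() and has_shadow_stack() are
hypothetical, and the bit positions are the ones proposed in this patch,
which may differ in a released uapi header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as proposed in this patch; guarded in case a header has them. */
#ifndef RISCV_HWPROBE_EXT_ZICFILP
#define RISCV_HWPROBE_EXT_ZICFILP	(1ULL << 49)
#endif
#ifndef RISCV_HWPROBE_EXT_ZICFISS
#define RISCV_HWPROBE_EXT_ZICFISS	(1ULL << 50)
#endif

/* ima_ext_0 is the value reported for RISCV_HWPROBE_KEY_IMA_EXT_0. */
static bool has_landing_pad(uint64_t ima_ext_0)
{
	return (ima_ext_0 & RISCV_HWPROBE_EXT_ZICFILP) != 0;
}

static bool has_shadow_stack(uint64_t ima_ext_0)
{
	return (ima_ext_0 & RISCV_HWPROBE_EXT_ZICFISS) != 0;
}
```

The two extensions are reported independently, so a caller should test
each bit on its own and not assume one implies the other.
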
 arch/riscv/include/uapi/asm/hwprobe.h | 2 ++
 arch/riscv/kernel/sys_hwprobe.c       | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/arch/riscv/include/uapi/asm/hwprobe.h b/arch/riscv/include/uapi/asm/hwprobe.h
index 1e153cda57db..d5c5dec9ae6c 100644
--- a/arch/riscv/include/uapi/asm/hwprobe.h
+++ b/arch/riscv/include/uapi/asm/hwprobe.h
@@ -72,6 +72,8 @@ struct riscv_hwprobe {
 #define		RISCV_HWPROBE_EXT_ZCF		(1ULL << 46)
 #define		RISCV_HWPROBE_EXT_ZCMOP		(1ULL << 47)
 #define		RISCV_HWPROBE_EXT_ZAWRS		(1ULL << 48)
+#define		RISCV_HWPROBE_EXT_ZICFILP	(1ULL << 49)
+#define		RISCV_HWPROBE_EXT_ZICFISS	(1ULL << 50)
 #define RISCV_HWPROBE_KEY_CPUPERF_0	5
 #define		RISCV_HWPROBE_MISALIGNED_UNKNOWN	(0 << 0)
 #define		RISCV_HWPROBE_MISALIGNED_EMULATED	(1 << 0)
diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c
index cea0ca2bf2a2..98f72ad7124f 100644
--- a/arch/riscv/kernel/sys_hwprobe.c
+++ b/arch/riscv/kernel/sys_hwprobe.c
@@ -107,6 +107,8 @@ static void hwprobe_isa_ext0(struct riscv_hwprobe *pair,
 		EXT_KEY(ZCB);
 		EXT_KEY(ZCMOP);
 		EXT_KEY(ZICBOZ);
+		EXT_KEY(ZICFILP);
+		EXT_KEY(ZICFISS);
 		EXT_KEY(ZICOND);
 		EXT_KEY(ZIHINTNTL);
 		EXT_KEY(ZIHINTPAUSE);

-- 
2.45.0



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v6 27/33] riscv: Add Firmware Feature SBI extensions definitions
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (25 preceding siblings ...)
  2024-10-08 22:37 ` [PATCH v6 26/33] riscv/hwprobe: zicfilp / zicfiss enumeration in hwprobe Deepak Gupta
@ 2024-10-08 22:37 ` Deepak Gupta
  2024-10-08 22:37 ` [PATCH v6 28/33] riscv: enable kernel access to shadow stack memory via FWFT sbi call Deepak Gupta
                   ` (6 subsequent siblings)
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:37 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta

From: Clément Léger <cleger@rivosinc.com>

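Add the SBI definitions needed to use the Firmware Features (FWFT)
extension: the extension ID, the SET/GET function IDs, the feature ID
ranges and the lock flag.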

Signed-off-by: Clément Léger <cleger@rivosinc.com>
---
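Not part of the patch: a minimal user-space sketch of how the feature ID
ranges in the enum below line up with the two feature bits. The helper names
fwft_feature_is_global() and fwft_feature_is_platform() are hypothetical and
exist only for illustration; the constants are copied from the hunk below
(written with an unsigned 1 here).

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the sbi.h hunk in this patch. */
#define SBI_FWFT_SHADOW_STACK		0x2u
#define SBI_FWFT_GLOBAL_FEATURE_BIT	(1u << 31)
#define SBI_FWFT_PLATFORM_FEATURE_BIT	(1u << 30)

/* Hypothetical helper: bit 31 set means the feature ID is in a GLOBAL range. */
static bool fwft_feature_is_global(uint32_t feature)
{
	return (feature & SBI_FWFT_GLOBAL_FEATURE_BIT) != 0;
}

/* Hypothetical helper: bit 30 set means the feature ID is in a PLATFORM range. */
static bool fwft_feature_is_platform(uint32_t feature)
{
	return (feature & SBI_FWFT_PLATFORM_FEATURE_BIT) != 0;
}
```

With these helpers, SBI_FWFT_SHADOW_STACK (0x2) is neither global nor
platform specific, i.e. it sits in the local range before
SBI_FWFT_LOCAL_RESERVED_START.
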
 arch/riscv/include/asm/sbi.h | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
index 98f631b051db..754e5cdabf46 100644
--- a/arch/riscv/include/asm/sbi.h
+++ b/arch/riscv/include/asm/sbi.h
@@ -34,6 +34,7 @@ enum sbi_ext_id {
 	SBI_EXT_PMU = 0x504D55,
 	SBI_EXT_DBCN = 0x4442434E,
 	SBI_EXT_STA = 0x535441,
+	SBI_EXT_FWFT = 0x46574654,
 
 	/* Experimentals extensions must lie within this range */
 	SBI_EXT_EXPERIMENTAL_START = 0x08000000,
@@ -281,6 +282,32 @@ struct sbi_sta_struct {
 
 #define SBI_SHMEM_DISABLE		-1
 
+/* SBI function IDs for FW feature extension */
+#define SBI_EXT_FWFT_SET		0x0
+#define SBI_EXT_FWFT_GET		0x1
+
+enum sbi_fwft_feature_t {
+	SBI_FWFT_MISALIGNED_EXC_DELEG		= 0x0,
+	SBI_FWFT_LANDING_PAD			= 0x1,
+	SBI_FWFT_SHADOW_STACK			= 0x2,
+	SBI_FWFT_DOUBLE_TRAP			= 0x3,
+	SBI_FWFT_PTE_AD_HW_UPDATING		= 0x4,
+	SBI_FWFT_LOCAL_RESERVED_START		= 0x5,
+	SBI_FWFT_LOCAL_RESERVED_END		= 0x3fffffff,
+	SBI_FWFT_LOCAL_PLATFORM_START		= 0x40000000,
+	SBI_FWFT_LOCAL_PLATFORM_END		= 0x7fffffff,
+
+	SBI_FWFT_GLOBAL_RESERVED_START		= 0x80000000,
+	SBI_FWFT_GLOBAL_RESERVED_END		= 0xbfffffff,
+	SBI_FWFT_GLOBAL_PLATFORM_START		= 0xc0000000,
+	SBI_FWFT_GLOBAL_PLATFORM_END		= 0xffffffff,
+};
+
+#define SBI_FWFT_GLOBAL_FEATURE_BIT		(1U << 31)
+#define SBI_FWFT_PLATFORM_FEATURE_BIT		(1U << 30)
+
+#define SBI_FWFT_SET_FLAG_LOCK			(1 << 0)
+
 /* SBI spec version fields */
 #define SBI_SPEC_VERSION_DEFAULT	0x1
 #define SBI_SPEC_VERSION_MAJOR_SHIFT	24

-- 
2.45.0




* [PATCH v6 28/33] riscv: enable kernel access to shadow stack memory via FWFT sbi call
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (26 preceding siblings ...)
  2024-10-08 22:37 ` [PATCH v6 27/33] riscv: Add Firmware Feature SBI extensions definitions Deepak Gupta
@ 2024-10-08 22:37 ` Deepak Gupta
  2024-10-08 22:37 ` [PATCH v6 29/33] riscv: kernel command line option to opt out of user cfi Deepak Gupta
                   ` (5 subsequent siblings)
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:37 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta

The kernel has to perform shadow stack operations on the user shadow stack.
For example, a shadow stack token must be created during signal delivery
and validated during sigreturn. Thus shadow stack access for the kernel
must be enabled.

In the future, when kernel shadow stacks are enabled for the Linux kernel,
they must be enabled as early as possible for better coverage and to
prevent an imbalance between the regular stack and the shadow stack. The
point right after `relocate_enable_mmu` is the earliest at which this can
be done.

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
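Not part of the patch: a small user-space model of the register setup that
the head.S hunks below perform before the ecall. It assumes the usual SBI
calling convention (extension ID in a7, function ID in a6, arguments
starting at a0). The struct sbi_call_regs and the helper fwft_set_regs() are
hypothetical names used only for this sketch.

```c
#include <stdint.h>

/* Values copied from patch 27 of this series. */
#define SBI_EXT_FWFT		0x46574654UL	/* ASCII "FWFT" */
#define SBI_EXT_FWFT_SET	0x0UL
#define SBI_FWFT_SHADOW_STACK	0x2UL
#define SBI_FWFT_SET_FLAG_LOCK	(1UL << 0)

/* Hypothetical model of the registers loaded before the ecall. */
struct sbi_call_regs {
	unsigned long a0;	/* feature ID */
	unsigned long a1;	/* value to set */
	unsigned long a2;	/* flags */
	unsigned long a6;	/* SBI function ID */
	unsigned long a7;	/* SBI extension ID */
};

/* Build the register image for an FWFT SET call. */
static struct sbi_call_regs fwft_set_regs(unsigned long feature,
					  unsigned long value,
					  unsigned long flags)
{
	struct sbi_call_regs r = {
		.a0 = feature,
		.a1 = value,
		.a2 = flags,
		.a6 = SBI_EXT_FWFT_SET,
		.a7 = SBI_EXT_FWFT,
	};

	return r;
}
```

Calling fwft_set_regs(SBI_FWFT_SHADOW_STACK, 1, SBI_FWFT_SET_FLAG_LOCK)
yields the same five values that the assembly loads. Note that the hunks
below continue straight to scs_load_current after the ecall and do not
inspect what the call returns.
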
 arch/riscv/kernel/asm-offsets.c |  4 ++++
 arch/riscv/kernel/head.S        | 12 ++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c
index 766bd33f10cb..a22ab8a41672 100644
--- a/arch/riscv/kernel/asm-offsets.c
+++ b/arch/riscv/kernel/asm-offsets.c
@@ -517,4 +517,8 @@ void asm_offsets(void)
 	DEFINE(FREGS_A6,	    offsetof(struct ftrace_regs, a6));
 	DEFINE(FREGS_A7,	    offsetof(struct ftrace_regs, a7));
 #endif
+	DEFINE(SBI_EXT_FWFT, SBI_EXT_FWFT);
+	DEFINE(SBI_EXT_FWFT_SET, SBI_EXT_FWFT_SET);
+	DEFINE(SBI_FWFT_SHADOW_STACK, SBI_FWFT_SHADOW_STACK);
+	DEFINE(SBI_FWFT_SET_FLAG_LOCK, SBI_FWFT_SET_FLAG_LOCK);
 }
diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
index 356d5397b2a2..6244408ca917 100644
--- a/arch/riscv/kernel/head.S
+++ b/arch/riscv/kernel/head.S
@@ -164,6 +164,12 @@ secondary_start_sbi:
 	call relocate_enable_mmu
 #endif
 	call .Lsetup_trap_vector
+	li a7, SBI_EXT_FWFT
+	li a6, SBI_EXT_FWFT_SET
+	li a0, SBI_FWFT_SHADOW_STACK
+	li a1, 1 /* enable supervisor access to shadow stack */
+	li a2, SBI_FWFT_SET_FLAG_LOCK
+	ecall
 	scs_load_current
 	call smp_callin
 #endif /* CONFIG_SMP */
@@ -320,6 +326,12 @@ SYM_CODE_START(_start_kernel)
 	la tp, init_task
 	la sp, init_thread_union + THREAD_SIZE
 	addi sp, sp, -PT_SIZE_ON_STACK
+	li a7, SBI_EXT_FWFT
+	li a6, SBI_EXT_FWFT_SET
+	li a0, SBI_FWFT_SHADOW_STACK
+	li a1, 1 /* enable supervisor access to shadow stack */
+	li a2, SBI_FWFT_SET_FLAG_LOCK
+	ecall
 	scs_load_current
 
 #ifdef CONFIG_KASAN

-- 
2.45.0




* [PATCH v6 29/33] riscv: kernel command line option to opt out of user cfi
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (27 preceding siblings ...)
  2024-10-08 22:37 ` [PATCH v6 28/33] riscv: enable kernel access to shadow stack memory via FWFT sbi call Deepak Gupta
@ 2024-10-08 22:37 ` Deepak Gupta
  2024-10-08 22:37 ` [PATCH v6 30/33] riscv: create a config for shadow stack and landing pad instr support Deepak Gupta
                   ` (4 subsequent siblings)
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:37 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta

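This commit adds a kernel command line option, `disable_riscv_usercfi=true`,
with which user cfi can be disabled. When it is set,
arch_set_shadow_stack_status() and arch_set_indir_br_lp_status() return 0
without enabling anything.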

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
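Not part of the patch: a user-space sketch of the string handling in
setup_global_riscv_enable() below, to make the accepted values explicit.
The function name parse_disable_riscv_usercfi() is hypothetical; only the
strcmp() test and the flag name are taken from the hunk.

```c
#include <stdbool.h>
#include <string.h>

/* Same name as the kernel flag, but a plain user-space variable here. */
static bool disable_riscv_usercfi;

/*
 * Hypothetical stand-in for the __setup() handler: only the exact string
 * "true" sets the flag, and nothing ever clears it again.
 * Returns the resulting state of the flag.
 */
static bool parse_disable_riscv_usercfi(const char *str)
{
	if (strcmp(str, "true") == 0)
		disable_riscv_usercfi = true;

	return disable_riscv_usercfi;
}
```

In this model "1", "yes" or "TRUE" leave user cfi enabled, and a later
"false" does not re-enable it once "true" has been seen.
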
 arch/riscv/kernel/usercfi.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/arch/riscv/kernel/usercfi.c b/arch/riscv/kernel/usercfi.c
index 92d03eb76c03..fb17a67568a8 100644
--- a/arch/riscv/kernel/usercfi.c
+++ b/arch/riscv/kernel/usercfi.c
@@ -17,6 +17,8 @@
 #include <asm/csr.h>
 #include <asm/usercfi.h>
 
+bool disable_riscv_usercfi;
+
 #define SHSTK_ENTRY_SIZE sizeof(void *)
 
 bool is_shstk_enabled(struct task_struct *task)
@@ -393,6 +395,9 @@ int arch_set_shadow_stack_status(struct task_struct *t, unsigned long status)
 	unsigned long size = 0, addr = 0;
 	bool enable_shstk = false;
 
+	if (disable_riscv_usercfi)
+		return 0;
+
 	if (!cpu_supports_shadow_stack())
 		return -EINVAL;
 
@@ -472,6 +477,9 @@ int arch_set_indir_br_lp_status(struct task_struct *t, unsigned long status)
 {
 	bool enable_indir_lp = false;
 
+	if (disable_riscv_usercfi)
+		return 0;
+
 	if (!cpu_supports_indirect_br_lp_instr())
 		return -EINVAL;
 
@@ -504,3 +512,15 @@ int arch_lock_indir_br_lp_status(struct task_struct *task,
 
 	return 0;
 }
+
+static int __init setup_global_riscv_enable(char *str)
+{
+	if (strcmp(str, "true") == 0)
+		disable_riscv_usercfi = true;
+
+	pr_info("Setting riscv usercfi to be %s\n", (disable_riscv_usercfi ? "disabled" : "enabled"));
+
+	return 1;
+}
+
+__setup("disable_riscv_usercfi=", setup_global_riscv_enable);

-- 
2.45.0




* [PATCH v6 30/33] riscv: create a config for shadow stack and landing pad instr support
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (28 preceding siblings ...)
  2024-10-08 22:37 ` [PATCH v6 29/33] riscv: kernel command line option to opt out of user cfi Deepak Gupta
@ 2024-10-08 22:37 ` Deepak Gupta
  2024-10-08 22:37 ` [PATCH v6 31/33] riscv: Documentation for landing pad / indirect branch tracking Deepak Gupta
                   ` (3 subsequent siblings)
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:37 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta

This patch creates a config for shadow stack support and landing pad
instruction support. Both can be enabled by selecting
`CONFIG_RISCV_USER_CFI`. Selecting `CONFIG_RISCV_USER_CFI` wires up the
path to enumerate CPU support and, if CPU support exists, the kernel will
support CPU-assisted user mode cfi.

If `CONFIG_RISCV_USER_CFI` is selected, select `ARCH_USES_HIGH_VMA_FLAGS`,
`ARCH_HAS_USER_SHADOW_STACK` and `DYNAMIC_SIGFRAME` for riscv.

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
 arch/riscv/Kconfig | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 808ea66b9537..1335dbe91ab9 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -245,6 +245,26 @@ config ARCH_HAS_BROKEN_DWARF5
 	# https://github.com/llvm/llvm-project/commit/7ffabb61a5569444b5ac9322e22e5471cc5e4a77
 	depends on LD_IS_LLD && LLD_VERSION < 180000
 
+config RISCV_USER_CFI
+	bool "riscv userspace control flow integrity"
+	default y
+	depends on 64BIT && $(cc-option,-mabi=lp64 -march=rv64ima_zicfiss)
+	depends on RISCV_ALTERNATIVE
+	select ARCH_HAS_USER_SHADOW_STACK
+	select ARCH_USES_HIGH_VMA_FLAGS
+	select DYNAMIC_SIGFRAME
+	help
+	  Provides CPU assisted control flow integrity to userspace tasks.
+	  Control flow integrity is provided by implementing shadow stack for
+	  the backward edge and indirect branch tracking for the forward edge
+	  of the program.
+	  Shadow stack protection is a hardware feature that detects function
+	  return address corruption. This helps mitigate ROP attacks.
+	  Indirect branch tracking enforces that all indirect branches must land
+	  on a landing pad instruction, else the CPU will fault. This mitigates
+	  JOP / COP attacks. Applications must be enabled to use it, and old
+	  userspace does not get protection "for free".
+
 config ARCH_MMAP_RND_BITS_MIN
 	default 18 if 64BIT
 	default 8

-- 
2.45.0




* [PATCH v6 31/33] riscv: Documentation for landing pad / indirect branch tracking
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (29 preceding siblings ...)
  2024-10-08 22:37 ` [PATCH v6 30/33] riscv: create a config for shadow stack and landing pad instr support Deepak Gupta
@ 2024-10-08 22:37 ` Deepak Gupta
  2024-10-08 22:37 ` [PATCH v6 32/33] riscv: Documentation for shadow stack on riscv Deepak Gupta
                   ` (2 subsequent siblings)
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:37 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta

Add documentation on landing pad, aka indirect branch tracking, on riscv
and the kernel interfaces exposed so that user tasks can enable it.

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
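Not part of the patch: a user-space software model of the ``lpad`` check
that zicfilp.rst below describes, covering the three violation cases listed
in its last section. This is a sketch and not kernel or hardware code. The
names lpad_check() and enum lpad_result are hypothetical, and comparing the
immediate against bits 31:12 of x7 is an assumption based on ``lpad`` being
encoded as ``auipc x0, <imm_20bit>``. The exemptions for rs1 = x1/x5/x7 are
not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type for this model. */
enum lpad_result { LPAD_OK, LPAD_SW_CHECK };

#define LPAD_OPCODE_RD_MASK	0x00000fffu	/* opcode[6:0] and rd[11:7] */
#define LPAD_ENCODING		0x00000017u	/* auipc with rd = x0 */

/*
 * Model of what happens at the target of a tracked indirect branch:
 * target_pc is the branch target, insn the 32-bit instruction found
 * there, x7 the value of register x7 at the time of the branch.
 */
static enum lpad_result lpad_check(uint64_t target_pc, uint32_t insn,
				   uint64_t x7)
{
	uint32_t imm_20bit = insn >> 12;
	/* Assumption: the label is compared against x7[31:12]. */
	uint32_t x7_label = (uint32_t)((x7 >> 12) & 0xfffffu);

	if (target_pc & 0x3)		/* lpad not on a 4 byte boundary */
		return LPAD_SW_CHECK;
	if ((insn & LPAD_OPCODE_RD_MASK) != LPAD_ENCODING)
		return LPAD_SW_CHECK;	/* no lpad at the target */
	if (imm_20bit != 0 && imm_20bit != x7_label)
		return LPAD_SW_CHECK;	/* label mismatch */

	return LPAD_OK;
}
```

An ``lpad`` with a zero immediate passes regardless of x7, which is why the
label scheme only narrows the set of reachable targets when the compiler
emits non-zero labels.
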
 Documentation/arch/riscv/index.rst   |   1 +
 Documentation/arch/riscv/zicfilp.rst | 115 +++++++++++++++++++++++++++++++++++
 2 files changed, 116 insertions(+)

diff --git a/Documentation/arch/riscv/index.rst b/Documentation/arch/riscv/index.rst
index eecf347ce849..be7237b69682 100644
--- a/Documentation/arch/riscv/index.rst
+++ b/Documentation/arch/riscv/index.rst
@@ -14,6 +14,7 @@ RISC-V architecture
     uabi
     vector
     cmodx
+    zicfilp
 
     features
 
diff --git a/Documentation/arch/riscv/zicfilp.rst b/Documentation/arch/riscv/zicfilp.rst
new file mode 100644
index 000000000000..a188d78fcde6
--- /dev/null
+++ b/Documentation/arch/riscv/zicfilp.rst
@@ -0,0 +1,115 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+:Author: Deepak Gupta <debug@rivosinc.com>
+:Date:   12 January 2024
+
+====================================================
+Tracking indirect control transfers on RISC-V Linux
+====================================================
+
+This document briefly describes the interface provided to userspace by Linux
+to enable indirect branch tracking for user mode applications on RISC-V.
+
+1. Feature Overview
+--------------------
+
+Memory corruption issues usually result in crashes. However, in the hands of
+a creative adversary they can result in a variety of security issues.
+
+One of those security issues is a code re-use attack, where an adversary
+corrupts function pointers and chains them together to perform jump oriented
+programming (JOP) or call oriented programming (COP), thus compromising the
+control flow integrity (CFI) of the program.
+
+Function pointers live in read-write memory and are thus susceptible to
+corruption, which allows an adversary to reach any program counter (PC) in the
+address space. On RISC-V, the zicfilp extension enforces a restriction on such
+indirect control transfers:
+
+- indirect control transfers must land on a landing pad instruction ``lpad``.
+  There are two exceptions to this rule:
+
+  - rs1 = x1 or rs1 = x5, i.e. a return from a function. Returns are
+    protected using shadow stack (see zicfiss.rst)
+
+  - rs1 = x7. On RISC-V, the compiler usually does the following to reach a
+    function which is beyond the offset possible with a J-type instruction::
+
+      auipc x7, <imm>
+      jalr (x7)
+
+    Such forms of indirect control transfer are still immutable and don't rely
+    on memory, and thus rs1=x7 is exempted from tracking and considered a
+    software guarded jump.
+
+The ``lpad`` instruction is a pseudo-op of ``auipc rd, <imm_20bit>`` with
+``rd=x0`` and is a HINT nop. ``lpad`` must be aligned on a 4 byte boundary
+and compares its 20 bit immediate with ``x7``. If ``imm_20bit`` == 0, the CPU
+doesn't perform any comparison with ``x7``. If ``imm_20bit`` != 0, then
+``imm_20bit`` must match ``x7``, else the CPU will raise a
+``software check exception`` (``cause=18``) with ``*tval = 2``.
+
+The compiler can generate a hash over function signatures and set it up
+(truncated to 20 bits) in x7 at call sites, and function prologues can have an
+``lpad`` with the same function hash. This further reduces the number of
+program counters a call site can reach.
+
+2. ELF and psABI
+-----------------
+
+Toolchain sets up :c:macro:`GNU_PROPERTY_RISCV_FEATURE_1_FCFI` for property
+:c:macro:`GNU_PROPERTY_RISCV_FEATURE_1_AND` in notes section of the object file.
+
+3. Linux enabling
+------------------
+
+A user space program can have multiple shared objects loaded in its address
+space, and it's a difficult task to make sure all the dependencies have been
+compiled with support for indirect branch tracking. Thus it's left to the
+dynamic loader to enable indirect branch tracking for the program.
+
+4. prctl() enabling
+--------------------
+
+:c:macro:`PR_SET_INDIR_BR_LP_STATUS` / :c:macro:`PR_GET_INDIR_BR_LP_STATUS` /
+:c:macro:`PR_LOCK_INDIR_BR_LP_STATUS` are three prctls added to manage indirect
+branch tracking. The prctls are arch agnostic and return -EINVAL on other arches.
+
+* prctl(PR_SET_INDIR_BR_LP_STATUS, unsigned long arg)
+
+If arg1 is :c:macro:`PR_INDIR_BR_LP_ENABLE` and the CPU supports ``zicfilp``,
+the kernel will enable indirect branch tracking for the task. The dynamic
+loader can issue this :c:macro:`prctl` once it has determined that all the
+objects loaded in the address space support indirect branch tracking.
+Additionally, if there is a ``dlopen`` of an object which wasn't compiled with
+``zicfilp``, the dynamic loader can issue this prctl with arg1 set to 0 (i.e.
+:c:macro:`PR_INDIR_BR_LP_ENABLE` being clear)
+
+* prctl(PR_GET_INDIR_BR_LP_STATUS, unsigned long *arg)
+
+Returns current status of indirect branch tracking. If enabled it'll return
+:c:macro:`PR_INDIR_BR_LP_ENABLE`
+
+* prctl(PR_LOCK_INDIR_BR_LP_STATUS, unsigned long arg)
+
+Locks current status of indirect branch tracking on the task. User space may
+want to run with a strict security posture and wouldn't want objects without
+``zicfilp`` support to be loaded, and thus would want to disallow disabling of
+indirect branch tracking. In that case user space can use this prctl to lock
+current settings.
+
+5. Violations related to indirect branch tracking
+--------------------------------------------------
+
+Pertaining to indirect branch tracking, CPU raises software check exception in
+following conditions:
+
+- missing ``lpad`` after indirect call / jump
+- ``lpad`` not on 4 byte boundary
+- ``imm_20bit`` embedded in ``lpad`` instruction doesn't match with ``x7``
+
+In all 3 cases, ``*tval = 2`` is captured and software check exception is
+raised (``cause=18``)
+
+Linux kernel will treat this as :c:macro:`SIGSEGV` with code =
+:c:macro:`SEGV_CPERR` and follow normal course of signal delivery.

-- 
2.45.0




* [PATCH v6 32/33] riscv: Documentation for shadow stack on riscv
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (30 preceding siblings ...)
  2024-10-08 22:37 ` [PATCH v6 31/33] riscv: Documentation for landing pad / indirect branch tracking Deepak Gupta
@ 2024-10-08 22:37 ` Deepak Gupta
  2024-10-08 22:37 ` [PATCH v6 33/33] kselftest/riscv: kselftest for user mode cfi Deepak Gupta
  2024-10-09 11:05 ` [PATCH v6 00/33] riscv control-flow integrity for usermode Mark Brown
  33 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:37 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta

Add documentation on shadow stack for user mode on riscv and the kernel
interfaces exposed so that user tasks can enable it.

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
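Not part of the patch: a user-space model of the shadow stack token scheme
that zicfiss.rst below describes (save a token when switching away, verify
it before switching back). This is a sketch under assumptions and not the
kernel implementation: struct shstk and all function names are hypothetical,
the shadow stack is an ordinary array, and the token layout (the token slot
holds the address just above itself) is an assumption made for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SHSTK_SLOTS 16

/* Hypothetical model: an array standing in for shadow stack memory. */
struct shstk {
	uintptr_t slot[SHSTK_SLOTS];
	uintptr_t *ssp;		/* models CSR_SSP; the stack grows down */
};

static void shstk_init(struct shstk *s)
{
	s->ssp = &s->slot[SHSTK_SLOTS];
}

/*
 * Switching away: push the current pointer value as the token.
 * Returns the address the token was written to, or NULL if full.
 */
static uintptr_t *shstk_save_token(struct shstk *s)
{
	uintptr_t old = (uintptr_t)s->ssp;

	if (s->ssp == s->slot)
		return NULL;
	s->ssp--;
	*s->ssp = old;
	return s->ssp;
}

/*
 * Switching to: accept 'where' only if it lies inside this shadow stack
 * and holds a token pointing just above itself. Only then is ssp written.
 */
static bool shstk_restore_from_token(struct shstk *s, uintptr_t *where)
{
	uintptr_t lo = (uintptr_t)&s->slot[0];
	uintptr_t hi = (uintptr_t)&s->slot[SHSTK_SLOTS];
	uintptr_t w = (uintptr_t)where;

	if (w < lo || w >= hi)
		return false;
	if (*where != w + sizeof(uintptr_t))
		return false;
	s->ssp = where + 1;	/* consume the token */
	return true;
}
```

In this model an attacker-chosen address is rejected unless a valid token
already sits there, which is the property the document relies on for
sigreturn.
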
 Documentation/arch/riscv/index.rst   |   1 +
 Documentation/arch/riscv/zicfiss.rst | 176 +++++++++++++++++++++++++++++++++++
 2 files changed, 177 insertions(+)

diff --git a/Documentation/arch/riscv/index.rst b/Documentation/arch/riscv/index.rst
index be7237b69682..e240eb0ceb70 100644
--- a/Documentation/arch/riscv/index.rst
+++ b/Documentation/arch/riscv/index.rst
@@ -15,6 +15,7 @@ RISC-V architecture
     vector
     cmodx
     zicfilp
+    zicfiss
 
     features
 
diff --git a/Documentation/arch/riscv/zicfiss.rst b/Documentation/arch/riscv/zicfiss.rst
new file mode 100644
index 000000000000..5ba389f15b3f
--- /dev/null
+++ b/Documentation/arch/riscv/zicfiss.rst
@@ -0,0 +1,176 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+:Author: Deepak Gupta <debug@rivosinc.com>
+:Date:   12 January 2024
+
+=========================================================
+Shadow stack to protect function returns on RISC-V Linux
+=========================================================
+
+This document briefly describes the interface provided to userspace by Linux
+to enable shadow stack for user mode applications on RISC-V.
+
+1. Feature Overview
+--------------------
+
+Memory corruption issues usually result in crashes. However, in the hands of
+a creative adversary they can result in a variety of security issues.
+
+One of those security issues is a code re-use attack, where an adversary
+corrupts return addresses present on the stack and chains them together to
+perform return oriented programming (ROP), thus compromising the control flow
+integrity (CFI) of the program.
+
+Return addresses live on the stack, in read-write memory, and are thus
+susceptible to corruption, which allows an adversary to reach any program
+counter (PC) in the address space. On RISC-V, the ``zicfiss`` extension
+provides an alternate stack, termed the shadow stack, on which return addresses
+can be safely placed in the prolog of a function and retrieved in the epilog.
+The ``zicfiss`` extension makes the following changes:
+
+- PTE encodings for shadow stack virtual memory:
+  a previously reserved encoding in first stage translation, i.e.
+  PTE.R=0, PTE.W=1, PTE.X=0, becomes the PTE encoding for shadow stack pages.
+
+- ``sspush x1/x5`` instruction pushes (stores) ``x1/x5`` to shadow stack.
+
+- ``sspopchk x1/x5`` instruction pops (loads) from shadow stack and compares
+  with ``x1/x5`` and, if unequal, CPU raises ``software check exception`` with
+  ``*tval = 3``
+
+The compiler toolchain makes sure that function prologues have ``sspush x1/x5``
+to save the return address on the shadow stack in addition to the regular
+stack. Similarly, function epilogs have ``ld x5, offset(x2)`` followed by
+``sspopchk x5`` to ensure that the value popped from the regular stack matches
+the value popped from the shadow stack.
+
+2. Shadow stack protections and Linux memory manager
+-----------------------------------------------------
+
+As mentioned earlier, shadow stacks get new page table encodings, and thus
+some special properties are assigned to them and to the instructions that
+operate on them, as below:
+
+- Regular stores to shadow stack memory raise store access faults. This way
+  shadow stack memory is protected from stray inadvertent writes.
+
+- Regular loads to shadow stack memory are allowed. This allows stack trace
+  utilities or backtrace functions to read true callstack (not tampered).
+
+- Only shadow stack instructions can generate shadow stack load or shadow stack
+  store.
+
+- Shadow stack load / shadow stack store on read-only memory raises an
+  AMO/store page fault. Thus both ``sspush x1/x5`` and ``sspopchk x1/x5`` will
+  raise an AMO/store page fault. This simplifies COW handling in the kernel.
+  During fork, the kernel can convert shadow stack pages into read-only memory
+  (as it does for regular read-write memory), and as soon as a subsequent
+  ``sspush`` or ``sspopchk`` is encountered in userspace, it can perform COW.
+
+- Shadow stack load / shadow stack store on read-write, read-write-execute
+  memory raises an access fault. This is a fatal condition because shadow stack
+  should never be operating on read-write, read-write-execute memory.
+
+3. ELF and psABI
+-----------------
+
+Toolchain sets up :c:macro:`GNU_PROPERTY_RISCV_FEATURE_1_BCFI` for property
+:c:macro:`GNU_PROPERTY_RISCV_FEATURE_1_AND` in notes section of the object file.
+
+4. Linux enabling
+------------------
+
+A user space program can have multiple shared objects loaded in its address
+space, and it's a difficult task to make sure all the dependencies have been
+compiled with support for shadow stack. Thus it's left to the dynamic loader
+to enable shadow stack for the program.
+
+5. prctl() enabling
+--------------------
+
+:c:macro:`PR_SET_SHADOW_STACK_STATUS` / :c:macro:`PR_GET_SHADOW_STACK_STATUS` /
+:c:macro:`PR_LOCK_SHADOW_STACK_STATUS` are three prctls added to manage shadow
+stack enabling for tasks. The prctls are arch agnostic and return -EINVAL on
+other arches.
+
+* prctl(PR_SET_SHADOW_STACK_STATUS, unsigned long arg)
+
+If arg1 is :c:macro:`PR_SHADOW_STACK_ENABLE` and the CPU supports ``zicfiss``,
+the kernel will enable shadow stack for the task. The dynamic loader can issue
+this :c:macro:`prctl` once it has determined that all the objects loaded in
+the address space support shadow stack. Additionally, if there is a
+:c:macro:`dlopen` of an object which wasn't compiled with ``zicfiss``, the
+dynamic loader can issue this prctl with arg1 set to 0 (i.e.
+:c:macro:`PR_SHADOW_STACK_ENABLE` being clear)
+
+* prctl(PR_GET_SHADOW_STACK_STATUS, unsigned long *arg)
+
+Returns current status of shadow stack enabling. If enabled it'll return
+:c:macro:`PR_SHADOW_STACK_ENABLE`.
+
+* prctl(PR_LOCK_SHADOW_STACK_STATUS, unsigned long arg)
+
+Locks current status of shadow stack enabling on the task. User space may want
+to run with a strict security posture and wouldn't want objects without
+``zicfiss`` support to be loaded, and thus would want to disallow disabling of
+shadow stack on the current task. In that case user space can use this prctl
+to lock current settings.
+
+6. Violations related to returns with shadow stack enabled
+-----------------------------------------------------------
+
+Pertaining to shadow stack, the CPU raises a software check exception in the
+following condition:
+
+- On execution of ``sspopchk x1/x5``, ``x1/x5`` didn't match top of shadow
+  stack. If a mismatch happens then the CPU sets ``*tval = 3`` and raises a
+  software check exception.
+
+Linux kernel will treat this as :c:macro:`SIGSEGV` with code =
+:c:macro:`SEGV_CPERR` and follow normal course of signal delivery.
+
+7. Shadow stack tokens
+-----------------------
+Regular stores to shadow stacks are not allowed, and thus shadow stacks can't
+be tampered with via arbitrary stray writes due to bugs. However, pivoting /
+switching to a shadow stack is done by simply writing to csr ``CSR_SSP``.
+This can be problematic because the value to be written to ``CSR_SSP`` usually
+lives somewhere in writeable memory, which allows an adversary to use a
+corruption bug in software to pivot to any address in the shadow stack range.
+Shadow stack tokens can help mitigate this problem by making sure that:
+
+- When software is switching away from a shadow stack, the shadow stack pointer
+  should be saved on the shadow stack itself as a ``shadow stack token``.
+
+- When software is switching to a shadow stack, it should read the
+  ``shadow stack token`` from the shadow stack pointer and verify that the
+  ``shadow stack token`` is itself a pointer to that shadow stack.
+
+- Once the token verification is done, software can perform the write to
+  ``CSR_SSP`` to switch shadow stack.
+
+Here software can be the user mode task runtime itself, managing various
+contexts as part of a single thread. Software can be the kernel as well, when
+the kernel has to deliver a signal to a user task and must save the shadow
+stack pointer. The kernel can perform a similar procedure by saving a token on
+the user shadow stack itself. This way, whenever :c:macro:`sigreturn` happens,
+the kernel can read and verify the token and then switch to the shadow stack.
+Using this mechanism, the kernel helps the user task so that a corruption
+issue in the user task can't be exploited by an adversary arbitrarily using
+:c:macro:`sigreturn`. The adversary will have to make sure that there is a
+``shadow stack token`` in addition to invoking :c:macro:`sigreturn`.
+
+8. Signal shadow stack
+-----------------------
+Following structure has been added to sigcontext for RISC-V::
+
+    struct __sc_riscv_cfi_state {
+        unsigned long ss_ptr;
+    };
+
+As part of signal delivery, shadow stack token is saved on current shadow stack
+itself and updated pointer is saved away in :c:macro:`ss_ptr` field in
+:c:macro:`__sc_riscv_cfi_state` under :c:macro:`sigcontext`. Existing shadow
+stack allocation is used for signal delivery. During :c:macro:`sigreturn`,
+kernel will obtain :c:macro:`ss_ptr` from :c:macro:`sigcontext` and verify the
+saved token on shadow stack itself and switch shadow stack.

-- 
2.45.0




* [PATCH v6 33/33] kselftest/riscv: kselftest for user mode cfi
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (31 preceding siblings ...)
  2024-10-08 22:37 ` [PATCH v6 32/33] riscv: Documentation for shadow stack on riscv Deepak Gupta
@ 2024-10-08 22:37 ` Deepak Gupta
  2024-10-11  5:44   ` Zong Li
  2024-10-09 11:05 ` [PATCH v6 00/33] riscv control-flow integrity for usermode Mark Brown
  33 siblings, 1 reply; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 22:37 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, broonie, rick.p.edgecombe,
	Deepak Gupta

Adds kselftest for the RISC-V control flow integrity implementation for user
mode. There is not a lot going on in the kernel for enabling landing pad for
user mode. The cfi selftests are intended to be compiled with a zicfilp and
zicfiss enabled compiler. Thus the kselftest simply checks whether landing pad
and shadow stack are enabled for the binary and process. The selftest then
registers a signal handler for SIGSEGV. Any control flow violation is
reported as SIGSEGV with si_code = SEGV_CPERR. The test will fail on receiving
any SEGV_CPERR. The shadow stack part has more changes in the kernel and thus
there are separate tests for it:

- Exercise `map_shadow_stack` syscall
- `fork` test to make sure COW works for shadow stack pages
- gup tests
  Kernel uses FOLL_FORCE when access happens to memory via
  /proc/<pid>/mem. Not breaking that for shadow stack.
- signal test. Make sure signal delivery results in token creation on
  shadow stack and consumes (and verifies) token on sigreturn
- shadow stack protection test. attempts to write using regular store
  instruction on shadow stack memory must result in access faults

Test output
==========

"""
TAP version 13
1..5
  This is to ensure shadow stack is indeed enabled and working
  This is to ensure shadow stack is indeed enabled and working
ok 1 shstk fork test
ok 2 map shadow stack syscall
ok 3 shadow stack gup tests
ok 4 shadow stack signal tests
ok 5 memory protections of shadow stack memory
"""

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
 tools/testing/selftests/riscv/Makefile             |   2 +-
 tools/testing/selftests/riscv/cfi/.gitignore       |   3 +
 tools/testing/selftests/riscv/cfi/Makefile         |  10 +
 tools/testing/selftests/riscv/cfi/cfi_rv_test.h    |  84 +++++
 tools/testing/selftests/riscv/cfi/riscv_cfi_test.c |  78 +++++
 tools/testing/selftests/riscv/cfi/shadowstack.c    | 373 +++++++++++++++++++++
 tools/testing/selftests/riscv/cfi/shadowstack.h    |  37 ++
 7 files changed, 586 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/riscv/Makefile b/tools/testing/selftests/riscv/Makefile
index 7ce03d832b64..6e142fe004ab 100644
--- a/tools/testing/selftests/riscv/Makefile
+++ b/tools/testing/selftests/riscv/Makefile
@@ -5,7 +5,7 @@
 ARCH ?= $(shell uname -m 2>/dev/null || echo not)
 
 ifneq (,$(filter $(ARCH),riscv))
-RISCV_SUBTARGETS ?= hwprobe vector mm sigreturn
+RISCV_SUBTARGETS ?= hwprobe vector mm sigreturn cfi
 else
 RISCV_SUBTARGETS :=
 endif
diff --git a/tools/testing/selftests/riscv/cfi/.gitignore b/tools/testing/selftests/riscv/cfi/.gitignore
new file mode 100644
index 000000000000..82545863bac6
--- /dev/null
+++ b/tools/testing/selftests/riscv/cfi/.gitignore
@@ -0,0 +1,3 @@
+cfitests
+riscv_cfi_test
+shadowstack
diff --git a/tools/testing/selftests/riscv/cfi/Makefile b/tools/testing/selftests/riscv/cfi/Makefile
new file mode 100644
index 000000000000..b65f7ff38a32
--- /dev/null
+++ b/tools/testing/selftests/riscv/cfi/Makefile
@@ -0,0 +1,10 @@
+CFLAGS += -I$(top_srcdir)/tools/include
+
+CFLAGS += -march=rv64gc_zicfilp_zicfiss
+
+TEST_GEN_PROGS := cfitests
+
+include ../../lib.mk
+
+$(OUTPUT)/cfitests: riscv_cfi_test.c shadowstack.c
+	$(CC) -o$@ $(CFLAGS) $(LDFLAGS) $^
diff --git a/tools/testing/selftests/riscv/cfi/cfi_rv_test.h b/tools/testing/selftests/riscv/cfi/cfi_rv_test.h
new file mode 100644
index 000000000000..0fefdc33f71e
--- /dev/null
+++ b/tools/testing/selftests/riscv/cfi/cfi_rv_test.h
@@ -0,0 +1,84 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef SELFTEST_RISCV_CFI_H
+#define SELFTEST_RISCV_CFI_H
+#include <stddef.h>
+#include <sys/types.h>
+#include "shadowstack.h"
+
+#define RISCV_CFI_SELFTEST_COUNT RISCV_SHADOW_STACK_TESTS
+
+#define CHILD_EXIT_CODE_SSWRITE		10
+#define CHILD_EXIT_CODE_SIG_TEST	11
+
+#define my_syscall5(num, arg1, arg2, arg3, arg4, arg5)			\
+({									\
+	register long _num  __asm__ ("a7") = (num);			\
+	register long _arg1 __asm__ ("a0") = (long)(arg1);		\
+	register long _arg2 __asm__ ("a1") = (long)(arg2);		\
+	register long _arg3 __asm__ ("a2") = (long)(arg3);		\
+	register long _arg4 __asm__ ("a3") = (long)(arg4);		\
+	register long _arg5 __asm__ ("a4") = (long)(arg5);		\
+									\
+	__asm__ volatile(						\
+		"ecall\n"						\
+		: "+r"							\
+		(_arg1)							\
+		: "r"(_arg2), "r"(_arg3), "r"(_arg4), "r"(_arg5),	\
+		  "r"(_num)						\
+		: "memory", "cc"					\
+	);								\
+	_arg1;								\
+})
+
+#define my_syscall3(num, arg1, arg2, arg3)				\
+({									\
+	register long _num  __asm__ ("a7") = (num);			\
+	register long _arg1 __asm__ ("a0") = (long)(arg1);		\
+	register long _arg2 __asm__ ("a1") = (long)(arg2);		\
+	register long _arg3 __asm__ ("a2") = (long)(arg3);		\
+									\
+	__asm__ volatile(						\
+		"ecall\n"						\
+		: "+r" (_arg1)						\
+		: "r"(_arg2), "r"(_arg3),				\
+		  "r"(_num)						\
+		: "memory", "cc"					\
+	);								\
+	_arg1;								\
+})
+
+#ifndef __NR_prctl
+#define __NR_prctl 167
+#endif
+
+#ifndef __NR_map_shadow_stack
+#define __NR_map_shadow_stack 453
+#endif
+
+#define CSR_SSP 0x011
+
+#ifdef __ASSEMBLY__
+#define __ASM_STR(x)    x
+#else
+#define __ASM_STR(x)    #x
+#endif
+
+#define csr_read(csr)							\
+({									\
+	register unsigned long __v;					\
+	__asm__ __volatile__ ("csrr %0, " __ASM_STR(csr)		\
+				: "=r" (__v) :				\
+				: "memory");				\
+	__v;								\
+})
+
+#define csr_write(csr, val)						\
+({									\
+	unsigned long __v = (unsigned long) (val);			\
+	__asm__ __volatile__ ("csrw " __ASM_STR(csr) ", %0"		\
+				: : "rK" (__v)				\
+				: "memory");				\
+})
+
+#endif
diff --git a/tools/testing/selftests/riscv/cfi/riscv_cfi_test.c b/tools/testing/selftests/riscv/cfi/riscv_cfi_test.c
new file mode 100644
index 000000000000..720a001f7c31
--- /dev/null
+++ b/tools/testing/selftests/riscv/cfi/riscv_cfi_test.c
@@ -0,0 +1,78 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include "../../kselftest.h"
+#include <signal.h>
+#include <asm/ucontext.h>
+#include <linux/prctl.h>
+#include "cfi_rv_test.h"
+
+/* do not optimize cfi related test functions */
+#pragma GCC push_options
+#pragma GCC optimize("O0")
+
+void sigsegv_handler(int signum, siginfo_t *si, void *uc)
+{
+	struct ucontext *ctx = (struct ucontext *) uc;
+
+	if (si->si_code == SEGV_CPERR) {
+		ksft_print_msg("Control flow violation happened somewhere\n");
+		ksft_print_msg("PC where violation happened %lx\n", ctx->uc_mcontext.gregs[0]);
+		exit(-1);
+	}
+
+	/* all other cases are expected to be of shadow stack write case */
+	exit(CHILD_EXIT_CODE_SSWRITE);
+}
+
+bool register_signal_handler(void)
+{
+	struct sigaction sa = {};
+
+	sa.sa_sigaction = sigsegv_handler;
+	sa.sa_flags = SA_SIGINFO;
+	if (sigaction(SIGSEGV, &sa, NULL)) {
+		ksft_print_msg("Registering signal handler for landing pad violation failed\n");
+		return false;
+	}
+
+	return true;
+}
+
+int main(int argc, char *argv[])
+{
+	int ret = 0;
+	unsigned long lpad_status = 0, ss_status = 0;
+
+	ksft_print_header();
+
+	ksft_print_msg("Starting risc-v tests\n");
+
+	/*
+	 * Landing pad test. Not a lot of kernel changes to support landing
+	 * pad for user mode except lighting up a bit in senvcfg via a prctl
+	 * Enable landing pad through out the execution of test binary
+	 */
+	ret = my_syscall5(__NR_prctl, PR_GET_INDIR_BR_LP_STATUS, &lpad_status, 0, 0, 0);
+	if (ret)
+		ksft_exit_fail_msg("Get landing pad status failed with %d\n", ret);
+
+	if (!(lpad_status & PR_INDIR_BR_LP_ENABLE))
+		ksft_exit_fail_msg("Landing pad is not enabled, should be enabled via glibc\n");
+
+	ret = my_syscall5(__NR_prctl, PR_GET_SHADOW_STACK_STATUS, &ss_status, 0, 0, 0);
+	if (ret)
+		ksft_exit_fail_msg("Get shadow stack failed with %d\n", ret);
+
+	if (!(ss_status & PR_SHADOW_STACK_ENABLE))
+		ksft_exit_fail_msg("Shadow stack is not enabled, should be enabled via glibc\n");
+
+	if (!register_signal_handler())
+		ksft_exit_fail_msg("Registering signal handler for SIGSEGV failed\n");
+
+	ksft_print_msg("Landing pad and shadow stack are enabled for binary\n");
+	execute_shadow_stack_tests();
+
+	return 0;
+}
+
+#pragma GCC pop_options
diff --git a/tools/testing/selftests/riscv/cfi/shadowstack.c b/tools/testing/selftests/riscv/cfi/shadowstack.c
new file mode 100644
index 000000000000..9d5301914578
--- /dev/null
+++ b/tools/testing/selftests/riscv/cfi/shadowstack.c
@@ -0,0 +1,373 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include "../../kselftest.h"
+#include <sys/wait.h>
+#include <signal.h>
+#include <fcntl.h>
+#include <asm-generic/unistd.h>
+#include <sys/mman.h>
+#include "shadowstack.h"
+#include "cfi_rv_test.h"
+
+/* do not optimize shadow stack related test functions */
+#pragma GCC push_options
+#pragma GCC optimize("O0")
+
+void zar(void)
+{
+	unsigned long ssp = 0;
+
+	ssp = csr_read(CSR_SSP);
+	ksft_print_msg("Spewing out shadow stack ptr: %lx\n"
+			"  This is to ensure shadow stack is indeed enabled and working\n",
+			ssp);
+}
+
+void bar(void)
+{
+	zar();
+}
+
+void foo(void)
+{
+	bar();
+}
+
+void zar_child(void)
+{
+	unsigned long ssp = 0;
+
+	ssp = csr_read(CSR_SSP);
+	ksft_print_msg("Spewing out shadow stack ptr: %lx\n"
+			"  This is to ensure shadow stack is indeed enabled and working\n",
+			ssp);
+}
+
+void bar_child(void)
+{
+	zar_child();
+}
+
+void foo_child(void)
+{
+	bar_child();
+}
+
+typedef void (call_func_ptr)(void);
+/*
+ * call couple of functions to test push pop.
+ */
+int shadow_stack_call_tests(call_func_ptr fn_ptr, bool parent)
+{
+	ksft_print_msg("Exercising dummy calls for sspush and sspopchk in"
+			" context of %s\n", parent ? "parent" : "child");
+
+	(fn_ptr)();
+
+	return 0;
+}
+
+/* forks a thread, and ensure shadow stacks fork out */
+bool shadow_stack_fork_test(unsigned long test_num, void *ctx)
+{
+	int pid = 0, child_status = 0, parent_pid = 0, ret = 0;
+	unsigned long ss_status = 0;
+
+	ksft_print_msg("Exercising shadow stack fork test\n");
+
+	ret = my_syscall5(__NR_prctl, PR_GET_SHADOW_STACK_STATUS, &ss_status, 0, 0, 0);
+	if (ret) {
+		ksft_exit_skip("Shadow stack get status prctl failed with errorcode %d\n", ret);
+		return false;
+	}
+
+	if (!(ss_status & PR_SHADOW_STACK_ENABLE))
+		ksft_exit_skip("Shadow stack is not enabled, should be enabled via glibc\n");
+
+	parent_pid = getpid();
+	pid = fork();
+
+	if (pid) {
+		ksft_print_msg("Parent pid %d and child pid %d\n", parent_pid, pid);
+		shadow_stack_call_tests(&foo, true);
+	} else
+		shadow_stack_call_tests(&foo_child, false);
+
+	if (pid) {
+		ksft_print_msg("Waiting on child to finish\n");
+		wait(&child_status);
+	} else {
+		/* exit child gracefully */
+		exit(0);
+	}
+
+	if (pid && WIFSIGNALED(child_status)) {
+		ksft_print_msg("Child faulted, fork test failed\n");
+		return false;
+	}
+
+	return true;
+}
+
+/* exercise `map_shadow_stack`, pivot to it and call some functions to ensure it works */
+#define SHADOW_STACK_ALLOC_SIZE 4096
+bool shadow_stack_map_test(unsigned long test_num, void *ctx)
+{
+	unsigned long shdw_addr;
+	int ret = 0;
+
+	ksft_print_msg("Exercising shadow stack map test\n");
+
+	shdw_addr = my_syscall3(__NR_map_shadow_stack, NULL, SHADOW_STACK_ALLOC_SIZE, 0);
+
+	if (((long) shdw_addr) <= 0) {
+		ksft_print_msg("map_shadow_stack failed with error code %d\n", (int) shdw_addr);
+		return false;
+	}
+
+	ret = munmap((void *) shdw_addr, SHADOW_STACK_ALLOC_SIZE);
+
+	if (ret) {
+		ksft_print_msg("munmap failed with error code %d\n", ret);
+		return false;
+	}
+
+	return true;
+}
+
+/*
+ * shadow stack protection tests. map a shadow stack and
+ * validate all memory protections work on it
+ */
+bool shadow_stack_protection_test(unsigned long test_num, void *ctx)
+{
+	unsigned long shdw_addr;
+	unsigned long *write_addr = NULL;
+	int ret = 0, pid = 0, child_status = 0;
+
+	ksft_print_msg("Exercising shadow stack protection test\n");
+
+	shdw_addr = my_syscall3(__NR_map_shadow_stack, NULL, SHADOW_STACK_ALLOC_SIZE, 0);
+
+	if (((long) shdw_addr) <= 0) {
+		ksft_print_msg("map_shadow_stack failed with error code %d\n", (int) shdw_addr);
+		return false;
+	}
+
+	write_addr = (unsigned long *) shdw_addr;
+	pid = fork();
+
+	/* no child was created, return false */
+	if (pid == -1)
+		return false;
+
+	/*
+	 * try to perform a store from child on shadow stack memory
+	 * it should result in SIGSEGV
+	 */
+	if (!pid) {
+		/* below write must lead to SIGSEGV */
+		*write_addr = 0xdeadbeef;
+	} else {
+		wait(&child_status);
+	}
+
+	/* test fail, if 0xdeadbeef present on shadow stack address */
+	if (*write_addr == 0xdeadbeef) {
+		ksft_print_msg("Write succeeded on shadow stack memory, shadow stack protection test"
+		" failed\n");
+		return false;
+	}
+
+	/* if child reached here, then fail */
+	if (!pid) {
+		ksft_print_msg("Shadow stack protection test: child reached unreachable state\n");
+		return false;
+	}
+
+	/* if child exited via signal handler but not for write on ss */
+	if (WIFEXITED(child_status) &&
+		WEXITSTATUS(child_status) != CHILD_EXIT_CODE_SSWRITE) {
+		ksft_print_msg("Shadow stack protection test: child wasn't signaled for write on"
+		" shadow stack\n");
+		return false;
+	}
+
+	ret = munmap(write_addr, SHADOW_STACK_ALLOC_SIZE);
+	if (ret) {
+		ksft_print_msg("Shadow stack protection test: munmap failed with error code %d\n",
+		ret);
+		return false;
+	}
+
+	return true;
+}
+
+#define SS_MAGIC_WRITE_VAL 0xbeefdead
+
+int gup_tests(int mem_fd, unsigned long *shdw_addr)
+{
+	unsigned long val = 0;
+
+	lseek(mem_fd, (unsigned long)shdw_addr, SEEK_SET);
+	if (read(mem_fd, &val, sizeof(val)) < 0) {
+		ksft_print_msg("Reading shadow stack mem via gup failed\n");
+		return 1;
+	}
+
+	val = SS_MAGIC_WRITE_VAL;
+	lseek(mem_fd, (unsigned long)shdw_addr, SEEK_SET);
+	if (write(mem_fd, &val, sizeof(val)) < 0) {
+		ksft_print_msg("Writing shadow stack mem via gup failed\n");
+		return 1;
+	}
+
+	if (*shdw_addr != SS_MAGIC_WRITE_VAL) {
+		ksft_print_msg("GUP write to shadow stack memory failed\n");
+		return 1;
+	}
+
+	return 0;
+}
+
+bool shadow_stack_gup_tests(unsigned long test_num, void *ctx)
+{
+	unsigned long shdw_addr = 0;
+	unsigned long *write_addr = NULL;
+	int fd = 0;
+	bool ret = false;
+
+	ksft_print_msg("Exercising shadow stack gup tests\n");
+	shdw_addr = my_syscall3(__NR_map_shadow_stack, NULL, SHADOW_STACK_ALLOC_SIZE, 0);
+
+	if (((long) shdw_addr) <= 0) {
+		ksft_print_msg("map_shadow_stack failed with error code %d\n", (int) shdw_addr);
+		return false;
+	}
+
+	write_addr = (unsigned long *) shdw_addr;
+
+	fd = open("/proc/self/mem", O_RDWR);
+	if (fd == -1)
+		return false;
+
+	if (gup_tests(fd, write_addr)) {
+		ksft_print_msg("gup tests failed\n");
+		goto out;
+	}
+
+	ret = true;
+out:
+	if (shdw_addr && munmap(write_addr, SHADOW_STACK_ALLOC_SIZE)) {
+		ksft_print_msg("munmap failed with errno %d\n", errno);
+		ret = false;
+	}
+
+	return ret;
+}
+
+volatile bool break_loop;
+
+void sigusr1_handler(int signo)
+{
+	break_loop = true;
+}
+
+bool sigusr1_signal_test(void)
+{
+	struct sigaction sa = {};
+
+	sa.sa_handler = sigusr1_handler;
+	sa.sa_flags = 0;
+	sigemptyset(&sa.sa_mask);
+	if (sigaction(SIGUSR1, &sa, NULL)) {
+		ksft_print_msg("Registering signal handler for SIGUSR1 failed\n");
+		return false;
+	}
+
+	return true;
+}
+/*
+ * shadow stack signal test. shadow stack must be enabled.
+ * register a signal, fork another thread which is waiting
+ * on signal. Send a signal from parent to child, verify
+ * that signal was received by child. If not test fails
+ */
+bool shadow_stack_signal_test(unsigned long test_num, void *ctx)
+{
+	int pid = 0, child_status = 0, ret = 0;
+	unsigned long ss_status = 0;
+
+	ksft_print_msg("Exercising shadow stack signal test\n");
+
+	ret = my_syscall5(__NR_prctl, PR_GET_SHADOW_STACK_STATUS, &ss_status, 0, 0, 0);
+	if (ret) {
+		ksft_print_msg("Shadow stack get status prctl failed with errorcode %d\n", ret);
+		return false;
+	}
+
+	if (!(ss_status & PR_SHADOW_STACK_ENABLE))
+		ksft_print_msg("Shadow stack is not enabled, should be enabled via glibc\n");
+
+	/* this should be caught by signal handler and do an exit */
+	if (!sigusr1_signal_test()) {
+		ksft_print_msg("Registering sigusr1 handler failed\n");
+		exit(-1);
+	}
+
+	pid = fork();
+
+	if (pid == -1) {
+		ksft_print_msg("Signal test: fork failed\n");
+		goto out;
+	}
+
+	if (pid == 0) {
+		while (!break_loop)
+			sleep(1);
+
+		exit(CHILD_EXIT_CODE_SIG_TEST);
+		/* child shouldn't go beyond here */
+	}
+
+	/* send SIGUSR1 to child */
+	kill(pid, SIGUSR1);
+	wait(&child_status);
+
+out:
+
+	return (WIFEXITED(child_status) &&
+			WEXITSTATUS(child_status) == CHILD_EXIT_CODE_SIG_TEST);
+}
+
+int execute_shadow_stack_tests(void)
+{
+	int ret = 0;
+	unsigned long test_count = 0;
+	unsigned long shstk_status = 0;
+	bool test_pass = false;
+
+	ksft_print_msg("Executing RISC-V shadow stack self tests\n");
+	ksft_set_plan(RISCV_SHADOW_STACK_TESTS);
+
+	ret = my_syscall5(__NR_prctl, PR_GET_SHADOW_STACK_STATUS, &shstk_status, 0, 0, 0);
+
+	if (ret != 0)
+		ksft_exit_fail_msg("Get shadow stack status failed with %d\n", ret);
+
+	/*
+	 * If we are here that means get shadow stack status succeeded and
+	 * thus shadow stack support is baked in the kernel.
+	 */
+	while (test_count < ARRAY_SIZE(shstk_tests)) {
+		test_pass = (*shstk_tests[test_count].t_func)(test_count, NULL);
+		ksft_test_result(test_pass, shstk_tests[test_count].name);
+		test_count++;
+	}
+
+	ksft_finished();
+
+	return 0;
+}
+
+#pragma GCC pop_options
diff --git a/tools/testing/selftests/riscv/cfi/shadowstack.h b/tools/testing/selftests/riscv/cfi/shadowstack.h
new file mode 100644
index 000000000000..b43e74136a26
--- /dev/null
+++ b/tools/testing/selftests/riscv/cfi/shadowstack.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef SELFTEST_SHADOWSTACK_TEST_H
+#define SELFTEST_SHADOWSTACK_TEST_H
+#include <stddef.h>
+#include <linux/prctl.h>
+
+/*
+ * a cfi test returns true for success or false for fail
+ * takes a number for test number to index into array and void pointer.
+ */
+typedef bool (*shstk_test_func)(unsigned long test_num, void *);
+
+struct shadow_stack_tests {
+	char *name;
+	shstk_test_func t_func;
+};
+
+bool shadow_stack_fork_test(unsigned long test_num, void *ctx);
+bool shadow_stack_map_test(unsigned long test_num, void *ctx);
+bool shadow_stack_protection_test(unsigned long test_num, void *ctx);
+bool shadow_stack_gup_tests(unsigned long test_num, void *ctx);
+bool shadow_stack_signal_test(unsigned long test_num, void *ctx);
+
+static struct shadow_stack_tests shstk_tests[] = {
+	{ "shstk fork test\n", shadow_stack_fork_test },
+	{ "map shadow stack syscall\n", shadow_stack_map_test },
+	{ "shadow stack gup tests\n", shadow_stack_gup_tests },
+	{ "shadow stack signal tests\n", shadow_stack_signal_test},
+	{ "memory protections of shadow stack memory\n", shadow_stack_protection_test }
+};
+
+#define RISCV_SHADOW_STACK_TESTS ARRAY_SIZE(shstk_tests)
+
+int execute_shadow_stack_tests(void);
+
+#endif

-- 
2.45.0



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 16/33] riscv/shstk: If needed allocate a new shadow stack on clone
  2024-10-08 22:36 ` [PATCH v6 16/33] riscv/shstk: If needed allocate a new shadow stack on clone Deepak Gupta
@ 2024-10-08 22:55   ` Edgecombe, Rick P
  2024-10-08 23:17     ` Deepak Gupta
  2024-10-09 10:25     ` Mark Brown
  0 siblings, 2 replies; 52+ messages in thread
From: Edgecombe, Rick P @ 2024-10-08 22:55 UTC (permalink / raw)
  To: corbet@lwn.net, robh@kernel.org, lorenzo.stoakes@oracle.com,
	dave.hansen@linux.intel.com, debug@rivosinc.com, vbabka@suse.cz,
	brauner@kernel.org, akpm@linux-foundation.org, palmer@dabbelt.com,
	mingo@redhat.com, paul.walmsley@sifive.com,
	Liam.Howlett@oracle.com, tglx@linutronix.de,
	aou@eecs.berkeley.edu, oleg@redhat.com, krzk+dt@kernel.org,
	conor@kernel.org, ebiederm@xmission.com, hpa@zytor.com,
	peterz@infradead.org, arnd@arndb.de, bp@alien8.de,
	kees@kernel.org, x86@kernel.org, shuah@kernel.org
  Cc: broonie@kernel.org, jim.shu@sifive.com, alistair.francis@wdc.com,
	cleger@rivosinc.com, kito.cheng@sifive.com,
	linux-kernel@vger.kernel.org, samitolvanen@google.com,
	evan@rivosinc.com, linux-mm@kvack.org, linux-arch@vger.kernel.org,
	atishp@rivosinc.com, andybnac@gmail.com,
	devicetree@vger.kernel.org, charlie@rivosinc.com,
	linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kselftest@vger.kernel.org, richard.henderson@linaro.org,
	linux-riscv@lists.infradead.org, alexghiti@rivosinc.com

On Tue, 2024-10-08 at 15:36 -0700, Deepak Gupta wrote:
> +unsigned long shstk_alloc_thread_stack(struct task_struct *tsk,
> +					   const struct kernel_clone_args *args)
> +{
> +	unsigned long addr, size;
> +
> +	/* If shadow stack is not supported, return 0 */
> +	if (!cpu_supports_shadow_stack())
> +		return 0;
> +
> +	/*
> +	 * If shadow stack is not enabled on the new thread, skip any
> +	 * switch to a new shadow stack.
> +	 */
> +	if (!is_shstk_enabled(tsk))
> +		return 0;
> +
> +	/*
> +	 * For CLONE_VFORK the child will share the parents shadow stack.
> +	 * Set base = 0 and size = 0, this is special means to track this state
> +	 * so the freeing logic run for child knows to leave it alone.
> +	 */
> +	if (args->flags & CLONE_VFORK) {
> +		set_shstk_base(tsk, 0, 0);
> +		return 0;
> +	}
> +
> +	/*
> +	 * For !CLONE_VM the child will use a copy of the parents shadow
> +	 * stack.
> +	 */
> +	if (!(args->flags & CLONE_VM))
> +		return 0;
> +
> +	/*
> +	 * reaching here means, CLONE_VM was specified and thus a separate shadow
> +	 * stack is needed for new cloned thread. Note: below allocation is happening
> +	 * using current mm.
> +	 */
> +	size = calc_shstk_size(args->stack_size);
> +	addr = allocate_shadow_stack(0, size, 0, false);
> +	if (IS_ERR_VALUE(addr))
> +		return addr;
> +
> +	set_shstk_base(tsk, addr, size);
> +
> +	return addr + size;
> +}

A lot of this patch and the previous one is similar to x86's and arm's. It's
great that we can have consistency around this behavior.

There might be enough consistency to refactor some of the arch code into a
kernel/shstk.c.

Should we try?

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 16/33] riscv/shstk: If needed allocate a new shadow stack on clone
  2024-10-08 22:55   ` Edgecombe, Rick P
@ 2024-10-08 23:17     ` Deepak Gupta
  2024-10-08 23:31       ` Edgecombe, Rick P
  2024-10-09 10:25     ` Mark Brown
  1 sibling, 1 reply; 52+ messages in thread
From: Deepak Gupta @ 2024-10-08 23:17 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: corbet@lwn.net, robh@kernel.org, lorenzo.stoakes@oracle.com,
	dave.hansen@linux.intel.com, vbabka@suse.cz, brauner@kernel.org,
	akpm@linux-foundation.org, palmer@dabbelt.com, mingo@redhat.com,
	paul.walmsley@sifive.com, Liam.Howlett@oracle.com,
	tglx@linutronix.de, aou@eecs.berkeley.edu, oleg@redhat.com,
	krzk+dt@kernel.org, conor@kernel.org, ebiederm@xmission.com,
	hpa@zytor.com, peterz@infradead.org, arnd@arndb.de, bp@alien8.de,
	kees@kernel.org, x86@kernel.org, shuah@kernel.org,
	broonie@kernel.org, jim.shu@sifive.com, alistair.francis@wdc.com,
	cleger@rivosinc.com, kito.cheng@sifive.com,
	linux-kernel@vger.kernel.org, samitolvanen@google.com,
	evan@rivosinc.com, linux-mm@kvack.org, linux-arch@vger.kernel.org,
	atishp@rivosinc.com, andybnac@gmail.com,
	devicetree@vger.kernel.org, charlie@rivosinc.com,
	linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kselftest@vger.kernel.org, richard.henderson@linaro.org,
	linux-riscv@lists.infradead.org, alexghiti@rivosinc.com

On Tue, Oct 08, 2024 at 10:55:29PM +0000, Edgecombe, Rick P wrote:
>On Tue, 2024-10-08 at 15:36 -0700, Deepak Gupta wrote:
>> +unsigned long shstk_alloc_thread_stack(struct task_struct *tsk,
>> +					   const struct kernel_clone_args *args)
>> +{
>> +	unsigned long addr, size;
>> +
>> +	/* If shadow stack is not supported, return 0 */
>> +	if (!cpu_supports_shadow_stack())
>> +		return 0;
>> +
>> +	/*
>> +	 * If shadow stack is not enabled on the new thread, skip any
>> +	 * switch to a new shadow stack.
>> +	 */
>> +	if (!is_shstk_enabled(tsk))
>> +		return 0;
>> +
>> +	/*
>> +	 * For CLONE_VFORK the child will share the parents shadow stack.
>> +	 * Set base = 0 and size = 0, this is special means to track this state
>> +	 * so the freeing logic run for child knows to leave it alone.
>> +	 */
>> +	if (args->flags & CLONE_VFORK) {
>> +		set_shstk_base(tsk, 0, 0);
>> +		return 0;
>> +	}
>> +
>> +	/*
>> +	 * For !CLONE_VM the child will use a copy of the parents shadow
>> +	 * stack.
>> +	 */
>> +	if (!(args->flags & CLONE_VM))
>> +		return 0;
>> +
>> +	/*
>> +	 * reaching here means, CLONE_VM was specified and thus a separate shadow
>> +	 * stack is needed for new cloned thread. Note: below allocation is happening
>> +	 * using current mm.
>> +	 */
>> +	size = calc_shstk_size(args->stack_size);
>> +	addr = allocate_shadow_stack(0, size, 0, false);
>> +	if (IS_ERR_VALUE(addr))
>> +		return addr;
>> +
>> +	set_shstk_base(tsk, addr, size);
>> +
>> +	return addr + size;
>> +}
>
>A lot of this patch and the previous one is similar to x86's and arm's. It's
>great that we can have consistency around this behavior.
>
>There might be enough consistency to refactor some of the arch code into a
>kernel/shstk.c.
>
>Should we try?

Yeah you're right. Honestly, I've been shameless in adapting most of the flows
from x86 `shstk.c` for risc-v. So thank you for that.

Now that `ARCH_HAS_USER_SHADOW_STACK` is part of multiple patch series (riscv
shadowstack, clone3 and I think the arm64 gcs series as well), it's probably the
appropriate time to find common ground.

This is what I suggest

- move most of the common/arch-agnostic shadow stack stuff into kernel/shstk.c.
   This gets compiled if `ARCH_HAS_USER_SHADOW_STACK` is enabled/selected.

- allow arch-specific hooks for guard checks such as "does the cpu support it" and
   "is shadow stack enabled on the task_struct" (I expect each arch's layout of
   task_struct will be different, no point finding common ground there), etc.

I think it's worth a try.
If you don't already have patches, I'll spend some time to see what it takes to
converge in my next version. If I run into a roadblock, I'll use this thread
for further discussion.
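
To make the suggested split concrete, here is a rough userspace model of it. It
is purely illustrative and not taken from any posted patch: `struct task`,
`struct shstk_arch_ops`, `common_shstk_alloc_thread_stack()` and the `toy_*`
helpers are hypothetical names, and the allocation failure path is simplified to
return 0 instead of an error value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in values so the sketch builds in userspace (assumed from Linux uapi). */
#ifndef CLONE_VM
#define CLONE_VM	0x00000100UL
#endif
#ifndef CLONE_VFORK
#define CLONE_VFORK	0x00004000UL
#endif

/* Hypothetical, simplified task state; real layouts differ per architecture. */
struct task {
	bool shstk_enabled;
	unsigned long shstk_base;
	unsigned long shstk_size;
};

/* Hypothetical per-arch hooks that the common code would call. */
struct shstk_arch_ops {
	bool (*cpu_supports)(void);
	bool (*task_enabled)(const struct task *tsk);
	unsigned long (*alloc)(unsigned long size);	/* base, or 0 on failure */
};

/*
 * Arch-agnostic part: decide whether a cloned task needs its own shadow
 * stack. Returns the initial shadow stack pointer, or 0 when no new stack
 * is set up (unsupported, disabled, vfork, no CLONE_VM, or failed alloc).
 */
static unsigned long
common_shstk_alloc_thread_stack(const struct shstk_arch_ops *ops,
				struct task *tsk, unsigned long clone_flags,
				unsigned long size)
{
	unsigned long base;

	if (!ops->cpu_supports() || !ops->task_enabled(tsk))
		return 0;

	/* vfork child shares the parent's shadow stack; base/size 0 marks that */
	if (clone_flags & CLONE_VFORK) {
		tsk->shstk_base = 0;
		tsk->shstk_size = 0;
		return 0;
	}

	/* without CLONE_VM the child uses a copy of the parent's shadow stack */
	if (!(clone_flags & CLONE_VM))
		return 0;

	base = ops->alloc(size);
	if (!base)
		return 0;

	tsk->shstk_base = base;
	tsk->shstk_size = size;
	return base + size;
}

/* Toy "architecture" used only to exercise the common code. */
static bool toy_cpu_supports(void) { return true; }
static bool toy_task_enabled(const struct task *tsk) { return tsk->shstk_enabled; }
static unsigned long toy_alloc(unsigned long size) { (void)size; return 0x10000UL; }

static const struct shstk_arch_ops toy_ops = {
	.cpu_supports = toy_cpu_supports,
	.task_enabled = toy_task_enabled,
	.alloc = toy_alloc,
};

/* Usage: a CLONE_VM thread gets a fresh stack; the result is its top. */
static void toy_demo(void)
{
	struct task t = { .shstk_enabled = true };

	assert(common_shstk_alloc_thread_stack(&toy_ops, &t, CLONE_VM,
					       0x1000UL) == 0x11000UL);
}
```

Only the three callbacks would need per-arch code in this model; the clone flag
handling lives in one place.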



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 16/33] riscv/shstk: If needed allocate a new shadow stack on clone
  2024-10-08 23:17     ` Deepak Gupta
@ 2024-10-08 23:31       ` Edgecombe, Rick P
  0 siblings, 0 replies; 52+ messages in thread
From: Edgecombe, Rick P @ 2024-10-08 23:31 UTC (permalink / raw)
  To: debug@rivosinc.com
  Cc: kito.cheng@sifive.com, tglx@linutronix.de,
	lorenzo.stoakes@oracle.com, linux-arch@vger.kernel.org,
	charlie@rivosinc.com, linux-fsdevel@vger.kernel.org,
	samitolvanen@google.com, devicetree@vger.kernel.org,
	peterz@infradead.org, corbet@lwn.net, kees@kernel.org,
	alistair.francis@wdc.com, broonie@kernel.org, andybnac@gmail.com,
	krzk+dt@kernel.org, palmer@dabbelt.com, x86@kernel.org,
	bp@alien8.de, linux-kernel@vger.kernel.org,
	linux-riscv@lists.infradead.org, aou@eecs.berkeley.edu,
	arnd@arndb.de, jim.shu@sifive.com, vbabka@suse.cz,
	shuah@kernel.org, Liam.Howlett@oracle.com, oleg@redhat.com,
	alexghiti@rivosinc.com, ebiederm@xmission.com,
	atishp@rivosinc.com, richard.henderson@linaro.org,
	cleger@rivosinc.com, brauner@kernel.org, hpa@zytor.com,
	mingo@redhat.com, robh@kernel.org,
	linux-kselftest@vger.kernel.org, paul.walmsley@sifive.com,
	linux-mm@kvack.org, evan@rivosinc.com, conor@kernel.org,
	akpm@linux-foundation.org, linux-doc@vger.kernel.org,
	dave.hansen@linux.intel.com

On Tue, 2024-10-08 at 16:17 -0700, Deepak Gupta wrote:
> Yeah you're right. Honestly, I've been shameless in adapting most of the flows
> from x86 `shstk.c` for risc-v. So thank you for that.

All good, glad we ended up with similar behavior.

> 
> Now that `ARCH_HAS_USER_SHADOW_STACK` is part of multiple patch series (riscv
> shadowstack, clone3 and I think the arm64 gcs series as well), it's probably the
> appropriate time to find common ground.

There have been bugs in these similar bits of code, so it will be nice not to
have to fix them in each arch separately.

> 
> This is what I suggest
> 
> - move most of the common/arch-agnostic shadow stack stuff into kernel/shstk.c.
>    This gets compiled if `ARCH_HAS_USER_SHADOW_STACK` is enabled/selected.

Yea, I guess we have commonality for (in x86 naming):
 - map_shadow_stack()
 - shstk_free()
 - shstk_alloc_thread_stack()
 - shstk_setup()

The signal part starts to diverge. Then I guess x86 has a different prctl
interface.

> 
> - allow arch-specific hooks for guard checks such as "does the cpu support it" and
>    "is shadow stack enabled on the task_struct" (I expect each arch's layout of
>    task_struct will be different, no point finding common ground there), etc.

Sure.

> 
> I think it's worth a try.
> If you don't already have patches, I'll spend some time to see what it takes to
> converge in my next version. If I run into a roadblock, I'll use this thread
> for further discussion.

Sounds good. I have not looked at it too much.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 16/33] riscv/shstk: If needed allocate a new shadow stack on clone
  2024-10-08 22:55   ` Edgecombe, Rick P
  2024-10-08 23:17     ` Deepak Gupta
@ 2024-10-09 10:25     ` Mark Brown
  1 sibling, 0 replies; 52+ messages in thread
From: Mark Brown @ 2024-10-09 10:25 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: corbet@lwn.net, robh@kernel.org, lorenzo.stoakes@oracle.com,
	dave.hansen@linux.intel.com, debug@rivosinc.com, vbabka@suse.cz,
	brauner@kernel.org, akpm@linux-foundation.org, palmer@dabbelt.com,
	mingo@redhat.com, paul.walmsley@sifive.com,
	Liam.Howlett@oracle.com, tglx@linutronix.de,
	aou@eecs.berkeley.edu, oleg@redhat.com, krzk+dt@kernel.org,
	conor@kernel.org, ebiederm@xmission.com, hpa@zytor.com,
	peterz@infradead.org, arnd@arndb.de, bp@alien8.de,
	kees@kernel.org, x86@kernel.org, shuah@kernel.org,
	jim.shu@sifive.com, alistair.francis@wdc.com, cleger@rivosinc.com,
	kito.cheng@sifive.com, linux-kernel@vger.kernel.org,
	samitolvanen@google.com, evan@rivosinc.com, linux-mm@kvack.org,
	linux-arch@vger.kernel.org, atishp@rivosinc.com,
	andybnac@gmail.com, devicetree@vger.kernel.org,
	charlie@rivosinc.com, linux-doc@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	richard.henderson@linaro.org, linux-riscv@lists.infradead.org,
	alexghiti@rivosinc.com

[-- Attachment #1: Type: text/plain, Size: 517 bytes --]

On Tue, Oct 08, 2024 at 10:55:29PM +0000, Edgecombe, Rick P wrote:

> A lot of this patch and the previous one is similar to x86's and arm's. It's
> great that we can have consistency around this behavior.

> There might be enough consistency to refactor some of the arch code into a
> kernel/shstk.c.

> Should we try?

I think so - I think we discussed it before.  I was thinking of looking
at it once the clone3() stuff settles down, I don't want to trigger any
unneeded refactorings there and cause further delays.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 18/33] prctl: arch-agnostic prctl for indirect branch tracking
  2024-10-08 22:37 ` [PATCH v6 18/33] prctl: arch-agnostic prctl for indirect branch tracking Deepak Gupta
@ 2024-10-09 11:03   ` Mark Brown
  0 siblings, 0 replies; 52+ messages in thread
From: Mark Brown @ 2024-10-09 11:03 UTC (permalink / raw)
  To: Deepak Gupta
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan, linux-kernel,
	linux-fsdevel, linux-mm, linux-riscv, devicetree, linux-arch,
	linux-doc, linux-kselftest, alistair.francis, richard.henderson,
	jim.shu, andybnac, kito.cheng, charlie, atishp, evan, cleger,
	alexghiti, samitolvanen, rick.p.edgecombe

[-- Attachment #1: Type: text/plain, Size: 580 bytes --]

On Tue, Oct 08, 2024 at 03:37:00PM -0700, Deepak Gupta wrote:
> Three architectures (x86, aarch64, riscv) have support for indirect branch
> tracking feature in a very similar fashion. On a very high level, indirect
> branch tracking is a CPU feature where CPU tracks branches which uses
> memory operand to perform control transfer in program. As part of this
> tracking on indirect branches, CPU goes in a state where it expects a
> landing pad instr on target and if not found then CPU raises some fault
> (architecture dependent)

Reviewed-by: Mark Brown <broonie@kernel.org>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 00/33] riscv control-flow integrity for usermode
  2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
                   ` (32 preceding siblings ...)
  2024-10-08 22:37 ` [PATCH v6 33/33] kselftest/riscv: kselftest for user mode cfi Deepak Gupta
@ 2024-10-09 11:05 ` Mark Brown
  33 siblings, 0 replies; 52+ messages in thread
From: Mark Brown @ 2024-10-09 11:05 UTC (permalink / raw)
  To: Deepak Gupta
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan, linux-kernel,
	linux-fsdevel, linux-mm, linux-riscv, devicetree, linux-arch,
	linux-doc, linux-kselftest, alistair.francis, richard.henderson,
	jim.shu, andybnac, kito.cheng, charlie, atishp, evan, cleger,
	alexghiti, samitolvanen, rick.p.edgecombe, David Hildenbrand,
	Carlos Bilbao, Samuel Holland, Andrew Jones, Conor Dooley,
	Andy Chiu

On Tue, Oct 08, 2024 at 03:36:42PM -0700, Deepak Gupta wrote:

> Equivalent to landing pad (zicfilp) on x86 is `ENDBRANCH` instruction in Intel
> CET [3] and branch target identification (BTI) [4] on arm.
> Similarly x86's Intel CET has shadow stack [5] and arm64 has guarded control
> stack (GCS) [6] which are very similar to risc-v's zicfiss shadow stack.

> x86 already supports shadow stack for user mode and arm64 support for GCS in
> usermode [7] is ongoing.

FWIW the arm64 support is now in -next, including these:

> Mark Brown (2):
>       mm: Introduce ARCH_HAS_USER_SHADOW_STACK
>       prctl: arch-agnostic prctl for shadow stack

shared changes to generic code.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 02/33] mm: helper `is_shadow_stack_vma` to check shadow stack vma
  2024-10-08 22:36 ` [PATCH v6 02/33] mm: helper `is_shadow_stack_vma` to check shadow stack vma Deepak Gupta
@ 2024-10-09 11:11   ` Mark Brown
  0 siblings, 0 replies; 52+ messages in thread
From: Mark Brown @ 2024-10-09 11:11 UTC (permalink / raw)
  To: Deepak Gupta
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan, linux-kernel,
	linux-fsdevel, linux-mm, linux-riscv, devicetree, linux-arch,
	linux-doc, linux-kselftest, alistair.francis, richard.henderson,
	jim.shu, andybnac, kito.cheng, charlie, atishp, evan, cleger,
	alexghiti, samitolvanen, rick.p.edgecombe

On Tue, Oct 08, 2024 at 03:36:44PM -0700, Deepak Gupta wrote:
> VM_SHADOW_STACK (alias to VM_HIGH_ARCH_5) is used to encode shadow stack
> VMA on three architectures (x86 shadow stack, arm GCS and RISC-V shadow
> stack). In case architecture doesn't implement shadow stack, it's VM_NONE
> Introducing a helper `is_shadow_stack_vma` to determine shadow stack vma
> or not.

Not that it makes any difference, but arm64 ended up defining
VM_SHADOW_STACK as VM_HIGH_ARCH_6.
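
For illustration only, here is a minimal userspace model of the helper
being discussed. It is a sketch, not the kernel code: the bit position
(37) and the MODEL_NO_SHADOW_STACK switch are assumptions made up for
this example, and the real flag values live in the kernel headers. It
shows why the exact VM_HIGH_ARCH_* alias does not matter to callers, and
why the helper is always false when VM_SHADOW_STACK is VM_NONE.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t vm_flags_t;

#define VM_NONE		0ULL
#define VM_WRITE	0x2ULL

/*
 * Hypothetical switch for this model: an architecture without shadow
 * stack support sees VM_SHADOW_STACK as VM_NONE.
 */
#ifdef MODEL_NO_SHADOW_STACK
#define VM_SHADOW_STACK	VM_NONE
#else
#define VM_SHADOW_STACK	(1ULL << 37)	/* arbitrary bit chosen for the model */
#endif

/* True only when the shadow stack bit is set in the VMA flags. */
static inline bool is_shadow_stack_vma(vm_flags_t vm_flags)
{
	return (vm_flags & VM_SHADOW_STACK) != 0;
}
```

Callers only ever test the flag through the helper, so whether an
architecture picks VM_HIGH_ARCH_5 or VM_HIGH_ARCH_6 is invisible to them.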

> Signed-off-by: Deepak Gupta <debug@rivosinc.com>
> ---
>  mm/gup.c |  2 +-
>  mm/vma.h | 10 +++++++---
>  2 files changed, 8 insertions(+), 4 deletions(-)

There's another test of VM_SHADOW_STACK in mainline now, added by my
change df7e1286b1dc3d6c ("mm: care about shadow stack guard gap when
getting an unmapped area") - sorry, I should've remembered this change
from your series and pulled it into mine.

> @@ -387,7 +392,6 @@ static inline bool is_data_mapping(vm_flags_t flags)
>         return (flags & (VM_WRITE | VM_SHARED | VM_STACK)) == VM_WRITE;
>  }
> 
> -
>  static inline void vma_iter_config(struct vma_iterator *vmi,
>                 unsigned long index, unsigned long last)

Unrelated whitespace change.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 06/33] riscv/Kconfig: enable HAVE_EXIT_THREAD for riscv
  2024-10-08 22:36 ` [PATCH v6 06/33] riscv/Kconfig: enable HAVE_EXIT_THREAD for riscv Deepak Gupta
@ 2024-10-09 11:28   ` Mark Brown
  2024-10-29 22:06     ` Deepak Gupta
  0 siblings, 1 reply; 52+ messages in thread
From: Mark Brown @ 2024-10-09 11:28 UTC (permalink / raw)
  To: Deepak Gupta
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan, linux-kernel,
	linux-fsdevel, linux-mm, linux-riscv, devicetree, linux-arch,
	linux-doc, linux-kselftest, alistair.francis, richard.henderson,
	jim.shu, andybnac, kito.cheng, charlie, atishp, evan, cleger,
	alexghiti, samitolvanen, rick.p.edgecombe

On Tue, Oct 08, 2024 at 03:36:48PM -0700, Deepak Gupta wrote:

> riscv will need an implementation for exit_thread to clean up shadow stack
> when thread exits. If current thread had shadow stack enabled, shadow
> stack is allocated by default for any new thread.

FWIW both arm64 and x86 do this via deactivate_mm().  ISTR there's some
case where exit_thread() doesn't quite do the right thing but I can't
remember the specifics right now, possibly the vfork() case but ICBW?
In any case, like Rick said, factoring out the common patterns would be
good, and keeping things aligned would support that.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 19/33] riscv: Implements arch agnostic shadow stack prctls
  2024-10-08 22:37 ` [PATCH v6 19/33] riscv: Implements arch agnostic shadow stack prctls Deepak Gupta
@ 2024-10-09 12:44   ` Mark Brown
  0 siblings, 0 replies; 52+ messages in thread
From: Mark Brown @ 2024-10-09 12:44 UTC (permalink / raw)
  To: Deepak Gupta
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan, linux-kernel,
	linux-fsdevel, linux-mm, linux-riscv, devicetree, linux-arch,
	linux-doc, linux-kselftest, alistair.francis, richard.henderson,
	jim.shu, andybnac, kito.cheng, charlie, atishp, evan, cleger,
	alexghiti, samitolvanen, rick.p.edgecombe

On Tue, Oct 08, 2024 at 03:37:01PM -0700, Deepak Gupta wrote:

> +int arch_lock_shadow_stack_status(struct task_struct *task,
> +				unsigned long arg)
> +{
> +	/* If shtstk not supported or not enabled on task, nothing to lock here */
> +	if (!cpu_supports_shadow_stack() ||
> +		!is_shstk_enabled(task))
> +		return -EINVAL;
> +
> +	set_shstk_lock(task);
> +
> +	return 0;
> +}

This will lock the shadow stack settings regardless of the value of arg.
On arm64 the argument is a mask of bits to block changes to.  For RISC-V
you only support enables, so there's only one bit that'll actually do
anything, but portable code could in theory try to do something like
masking writes or pushes only and get surprised that disabling shadow
stack gets blocked.  For arm64 the implementation accepts any possible
mask value, allowing userspace to block enabling of any future options
that get added.  In theory someone might end up calling with a value of
0 (e.g. if there's a config option for the bitmask to lock and they
don't bother optimising out the syscall if the value is 0), which would
definitely break.
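
To make the mask semantics concrete, here is a small userspace model of
a lock that honours arg. It is only a sketch under stated assumptions:
struct task_model, model_lock_status() and model_set_status() are
hypothetical names invented for this example, and returning -EBUSY for a
blocked change is a choice made for the model rather than a statement
about any architecture's code.

```c
#include <errno.h>

#define SHSTK_ENABLE	(1UL << 0)	/* stands in for PR_SHADOW_STACK_ENABLE */

/* Hypothetical per-task state, for this model only. */
struct task_model {
	unsigned long status;	/* feature bits currently enabled */
	unsigned long locked;	/* feature bits that may no longer change */
};

/* Lock exactly the bits named in arg; arg == 0 locks nothing. */
static int model_lock_status(struct task_model *t, unsigned long arg)
{
	t->locked |= arg;
	return 0;
}

/* Refuse any change that touches a locked bit, including unknown bits. */
static int model_set_status(struct task_model *t, unsigned long status)
{
	if ((t->status ^ status) & t->locked)
		return -EBUSY;
	t->status = status;
	return 0;
}
```

With this shape a lock call with arg == 0 leaves disabling possible,
while locking a bit that has no meaning yet still blocks a later attempt
to enable it.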

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 11/33] riscv/mm : ensure PROT_WRITE leads to VM_READ | VM_WRITE
  2024-10-08 22:36 ` [PATCH v6 11/33] riscv/mm : ensure PROT_WRITE leads to VM_READ | VM_WRITE Deepak Gupta
@ 2024-10-09 13:36   ` Lorenzo Stoakes
  2024-10-10  0:02     ` Deepak Gupta
  0 siblings, 1 reply; 52+ messages in thread
From: Lorenzo Stoakes @ 2024-10-09 13:36 UTC (permalink / raw)
  To: Deepak Gupta
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Conor Dooley,
	Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan, linux-kernel,
	linux-fsdevel, linux-mm, linux-riscv, devicetree, linux-arch,
	linux-doc, linux-kselftest, alistair.francis, richard.henderson,
	jim.shu, andybnac, kito.cheng, charlie, atishp, evan, cleger,
	alexghiti, samitolvanen, broonie, rick.p.edgecombe

On Tue, Oct 08, 2024 at 03:36:53PM -0700, Deepak Gupta wrote:
> `arch_calc_vm_prot_bits` is implemented on risc-v to return VM_READ |
> VM_WRITE if PROT_WRITE is specified. Similarly `riscv_sys_mmap` is
> updated to convert all incoming PROT_WRITE to (PROT_WRITE | PROT_READ).
> This is to make sure that any existing apps using PROT_WRITE still work.
>
> Earlier `protection_map[VM_WRITE]` used to pick read-write PTE encodings.
> Now `protection_map[VM_WRITE]` will always pick PAGE_SHADOWSTACK PTE
> encodings for shadow stack. Above changes ensure that existing apps
> continue to work because underneath kernel will be picking
> `protection_map[VM_WRITE|VM_READ]` PTE encodings.
>
> Signed-off-by: Deepak Gupta <debug@rivosinc.com>
> ---
>  arch/riscv/include/asm/mman.h    | 24 ++++++++++++++++++++++++
>  arch/riscv/include/asm/pgtable.h |  1 +
>  arch/riscv/kernel/sys_riscv.c    | 10 ++++++++++
>  arch/riscv/mm/init.c             |  2 +-
>  mm/mmap.c                        |  1 +
>  5 files changed, 37 insertions(+), 1 deletion(-)
>
> diff --git a/arch/riscv/include/asm/mman.h b/arch/riscv/include/asm/mman.h
> new file mode 100644
> index 000000000000..ef9fedf32546
> --- /dev/null
> +++ b/arch/riscv/include/asm/mman.h
> @@ -0,0 +1,24 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __ASM_MMAN_H__
> +#define __ASM_MMAN_H__
> +
> +#include <linux/compiler.h>
> +#include <linux/types.h>
> +#include <uapi/asm/mman.h>
> +
> +static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> +	unsigned long pkey __always_unused)
> +{
> +	unsigned long ret = 0;
> +
> +	/*
> +	 * If PROT_WRITE was specified, force it to VM_READ | VM_WRITE.
> +	 * Only VM_WRITE means shadow stack.
> +	 */
> +	if (prot & PROT_WRITE)
> +		ret = (VM_READ | VM_WRITE);
> +	return ret;
> +}
> +#define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
> +
> +#endif /* ! __ASM_MMAN_H__ */
> diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
> index e79f15293492..4948a1f18ae8 100644
> --- a/arch/riscv/include/asm/pgtable.h
> +++ b/arch/riscv/include/asm/pgtable.h
> @@ -177,6 +177,7 @@ extern struct pt_alloc_ops pt_ops __meminitdata;
>  #define PAGE_READ_EXEC		__pgprot(_PAGE_BASE | _PAGE_READ | _PAGE_EXEC)
>  #define PAGE_WRITE_EXEC		__pgprot(_PAGE_BASE | _PAGE_READ |	\
>  					 _PAGE_EXEC | _PAGE_WRITE)
> +#define PAGE_SHADOWSTACK       __pgprot(_PAGE_BASE | _PAGE_WRITE)
>
>  #define PAGE_COPY		PAGE_READ
>  #define PAGE_COPY_EXEC		PAGE_READ_EXEC
> diff --git a/arch/riscv/kernel/sys_riscv.c b/arch/riscv/kernel/sys_riscv.c
> index d77afe05578f..43a448bf254b 100644
> --- a/arch/riscv/kernel/sys_riscv.c
> +++ b/arch/riscv/kernel/sys_riscv.c
> @@ -7,6 +7,7 @@
>
>  #include <linux/syscalls.h>
>  #include <asm/cacheflush.h>
> +#include <asm-generic/mman-common.h>
>
>  static long riscv_sys_mmap(unsigned long addr, unsigned long len,
>  			   unsigned long prot, unsigned long flags,
> @@ -16,6 +17,15 @@ static long riscv_sys_mmap(unsigned long addr, unsigned long len,
>  	if (unlikely(offset & (~PAGE_MASK >> page_shift_offset)))
>  		return -EINVAL;
>
> +	/*
> +	 * If PROT_WRITE is specified then extend that to PROT_READ
> +	 * protection_map[VM_WRITE] is now going to select shadow stack encodings.
> +	 * So specifying PROT_WRITE actually should select protection_map [VM_WRITE | VM_READ]
> +	 * If user wants to create shadow stack then they should use `map_shadow_stack` syscall.
> +	 */
> +	if (unlikely((prot & PROT_WRITE) && !(prot & PROT_READ)))
> +		prot |= PROT_READ;
> +
>  	return ksys_mmap_pgoff(addr, len, prot, flags, fd,
>  			       offset >> (PAGE_SHIFT - page_shift_offset));
>  }
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index 0e8c20adcd98..964810aeb405 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -326,7 +326,7 @@ pgd_t early_pg_dir[PTRS_PER_PGD] __initdata __aligned(PAGE_SIZE);
>  static const pgprot_t protection_map[16] = {
>  	[VM_NONE]					= PAGE_NONE,
>  	[VM_READ]					= PAGE_READ,
> -	[VM_WRITE]					= PAGE_COPY,
> +	[VM_WRITE]					= PAGE_SHADOWSTACK,
>  	[VM_WRITE | VM_READ]				= PAGE_COPY,
>  	[VM_EXEC]					= PAGE_EXEC,
>  	[VM_EXEC | VM_READ]				= PAGE_READ_EXEC,
> diff --git a/mm/mmap.c b/mm/mmap.c
> index dd4b35a25aeb..b56f1e8cbfc6 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -47,6 +47,7 @@
>  #include <linux/oom.h>
>  #include <linux/sched/mm.h>
>  #include <linux/ksm.h>
> +#include <linux/processor.h>

This seems benign enough, just wonder why you need it?

>
>  #include <linux/uaccess.h>
>  #include <asm/cacheflush.h>
>
> --
> 2.45.0
>
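
As an aside, the PROT_WRITE promotion described in the commit message
above can be modelled in userspace roughly as follows. This is a sketch,
not the kernel implementation: the model_* functions and the pte_kind
enum are hypothetical, the VM_* values are local to the example, and
only the non-exec rows of the table are spelled out.

```c
#include <sys/mman.h>	/* PROT_READ, PROT_WRITE, PROT_EXEC */

#define VM_READ		0x1UL
#define VM_WRITE	0x2UL
#define VM_EXEC		0x4UL

/* Coarse stand-ins for the PTE encodings named in the patch. */
enum pte_kind { PTE_NONE, PTE_READ, PTE_SHADOWSTACK, PTE_COPY, PTE_OTHER };

/* Models the arch hook: PROT_WRITE always implies VM_READ | VM_WRITE. */
static unsigned long model_arch_calc_vm_prot_bits(unsigned long prot)
{
	return (prot & PROT_WRITE) ? (VM_READ | VM_WRITE) : 0;
}

/* Models the generic PROT_* to VM_* translation plus the arch hook. */
static unsigned long model_calc_vm_prot_bits(unsigned long prot)
{
	unsigned long flags = 0;

	if (prot & PROT_READ)
		flags |= VM_READ;
	if (prot & PROT_WRITE)
		flags |= VM_WRITE;
	if (prot & PROT_EXEC)
		flags |= VM_EXEC;
	return flags | model_arch_calc_vm_prot_bits(prot);
}

/* Models the table lookup after the change to the VM_WRITE slot. */
static enum pte_kind model_protection_map(unsigned long vm_flags)
{
	switch (vm_flags & (VM_READ | VM_WRITE | VM_EXEC)) {
	case 0:
		return PTE_NONE;
	case VM_READ:
		return PTE_READ;
	case VM_WRITE:
		return PTE_SHADOWSTACK;	/* write-only now means shadow stack */
	case VM_READ | VM_WRITE:
		return PTE_COPY;
	default:
		return PTE_OTHER;	/* exec combinations, not modelled */
	}
}
```

In this model a plain PROT_WRITE request never reaches the VM_WRITE-only
slot, which is the property the commit message relies on for existing
applications.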


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 11/33] riscv/mm : ensure PROT_WRITE leads to VM_READ | VM_WRITE
  2024-10-09 13:36   ` Lorenzo Stoakes
@ 2024-10-10  0:02     ` Deepak Gupta
  0 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-10  0:02 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Conor Dooley,
	Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan, linux-kernel,
	linux-fsdevel, linux-mm, linux-riscv, devicetree, linux-arch,
	linux-doc, linux-kselftest, alistair.francis, richard.henderson,
	jim.shu, andybnac, kito.cheng, charlie, atishp, evan, cleger,
	alexghiti, samitolvanen, broonie, rick.p.edgecombe

On Wed, Oct 09, 2024 at 02:36:12PM +0100, Lorenzo Stoakes wrote:
>On Tue, Oct 08, 2024 at 03:36:53PM -0700, Deepak Gupta wrote:
>> `arch_calc_vm_prot_bits` is implemented on risc-v to return VM_READ |
>> VM_WRITE if PROT_WRITE is specified. Similarly `riscv_sys_mmap` is
>> updated to convert all incoming PROT_WRITE to (PROT_WRITE | PROT_READ).
>> This is to make sure that any existing apps using PROT_WRITE still work.
>>
>> Earlier `protection_map[VM_WRITE]` used to pick read-write PTE encodings.
>> Now `protection_map[VM_WRITE]` will always pick PAGE_SHADOWSTACK PTE
>> encodings for shadow stack. Above changes ensure that existing apps
>> continue to work because underneath kernel will be picking
>> `protection_map[VM_WRITE|VM_READ]` PTE encodings.
>>
>> Signed-off-by: Deepak Gupta <debug@rivosinc.com>
>> ---
>>  arch/riscv/include/asm/mman.h    | 24 ++++++++++++++++++++++++
>>  arch/riscv/include/asm/pgtable.h |  1 +
>>  arch/riscv/kernel/sys_riscv.c    | 10 ++++++++++
>>  arch/riscv/mm/init.c             |  2 +-
>>  mm/mmap.c                        |  1 +
>>  5 files changed, 37 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/riscv/include/asm/mman.h b/arch/riscv/include/asm/mman.h
>> new file mode 100644
>> index 000000000000..ef9fedf32546
>> --- /dev/null
>> +++ b/arch/riscv/include/asm/mman.h
>> @@ -0,0 +1,24 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef __ASM_MMAN_H__
>> +#define __ASM_MMAN_H__
>> +
>> +#include <linux/compiler.h>
>> +#include <linux/types.h>
>> +#include <uapi/asm/mman.h>
>> +
>> +static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
>> +	unsigned long pkey __always_unused)
>> +{
>> +	unsigned long ret = 0;
>> +
>> +	/*
>> +	 * If PROT_WRITE was specified, force it to VM_READ | VM_WRITE.
>> +	 * Only VM_WRITE means shadow stack.
>> +	 */
>> +	if (prot & PROT_WRITE)
>> +		ret = (VM_READ | VM_WRITE);
>> +	return ret;
>> +}
>> +#define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
>> +
>> +#endif /* ! __ASM_MMAN_H__ */
>> diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
>> index e79f15293492..4948a1f18ae8 100644
>> --- a/arch/riscv/include/asm/pgtable.h
>> +++ b/arch/riscv/include/asm/pgtable.h
>> @@ -177,6 +177,7 @@ extern struct pt_alloc_ops pt_ops __meminitdata;
>>  #define PAGE_READ_EXEC		__pgprot(_PAGE_BASE | _PAGE_READ | _PAGE_EXEC)
>>  #define PAGE_WRITE_EXEC		__pgprot(_PAGE_BASE | _PAGE_READ |	\
>>  					 _PAGE_EXEC | _PAGE_WRITE)
>> +#define PAGE_SHADOWSTACK       __pgprot(_PAGE_BASE | _PAGE_WRITE)
>>
>>  #define PAGE_COPY		PAGE_READ
>>  #define PAGE_COPY_EXEC		PAGE_READ_EXEC
>> diff --git a/arch/riscv/kernel/sys_riscv.c b/arch/riscv/kernel/sys_riscv.c
>> index d77afe05578f..43a448bf254b 100644
>> --- a/arch/riscv/kernel/sys_riscv.c
>> +++ b/arch/riscv/kernel/sys_riscv.c
>> @@ -7,6 +7,7 @@
>>
>>  #include <linux/syscalls.h>
>>  #include <asm/cacheflush.h>
>> +#include <asm-generic/mman-common.h>
>>
>>  static long riscv_sys_mmap(unsigned long addr, unsigned long len,
>>  			   unsigned long prot, unsigned long flags,
>> @@ -16,6 +17,15 @@ static long riscv_sys_mmap(unsigned long addr, unsigned long len,
>>  	if (unlikely(offset & (~PAGE_MASK >> page_shift_offset)))
>>  		return -EINVAL;
>>
>> +	/*
>> +	 * If PROT_WRITE is specified then extend that to PROT_READ
>> +	 * protection_map[VM_WRITE] is now going to select shadow stack encodings.
>> +	 * So specifying PROT_WRITE actually should select protection_map [VM_WRITE | VM_READ]
>> +	 * If user wants to create shadow stack then they should use `map_shadow_stack` syscall.
>> +	 */
>> +	if (unlikely((prot & PROT_WRITE) && !(prot & PROT_READ)))
>> +		prot |= PROT_READ;
>> +
>>  	return ksys_mmap_pgoff(addr, len, prot, flags, fd,
>>  			       offset >> (PAGE_SHIFT - page_shift_offset));
>>  }
>> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
>> index 0e8c20adcd98..964810aeb405 100644
>> --- a/arch/riscv/mm/init.c
>> +++ b/arch/riscv/mm/init.c
>> @@ -326,7 +326,7 @@ pgd_t early_pg_dir[PTRS_PER_PGD] __initdata __aligned(PAGE_SIZE);
>>  static const pgprot_t protection_map[16] = {
>>  	[VM_NONE]					= PAGE_NONE,
>>  	[VM_READ]					= PAGE_READ,
>> -	[VM_WRITE]					= PAGE_COPY,
>> +	[VM_WRITE]					= PAGE_SHADOWSTACK,
>>  	[VM_WRITE | VM_READ]				= PAGE_COPY,
>>  	[VM_EXEC]					= PAGE_EXEC,
>>  	[VM_EXEC | VM_READ]				= PAGE_READ_EXEC,
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index dd4b35a25aeb..b56f1e8cbfc6 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -47,6 +47,7 @@
>>  #include <linux/oom.h>
>>  #include <linux/sched/mm.h>
>>  #include <linux/ksm.h>
>> +#include <linux/processor.h>
>
>This seems benign enough, just wonder why you need it?

I think it's a leftover from previous versions. I don't think it's needed
here anymore; will remove it.

>
>>
>>  #include <linux/uaccess.h>
>>  #include <asm/cacheflush.h>
>>
>> --
>> 2.45.0
>>


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 33/33] kselftest/riscv: kselftest for user mode cfi
  2024-10-08 22:37 ` [PATCH v6 33/33] kselftest/riscv: kselftest for user mode cfi Deepak Gupta
@ 2024-10-11  5:44   ` Zong Li
  2024-10-11 10:18     ` Mark Brown
  0 siblings, 1 reply; 52+ messages in thread
From: Zong Li @ 2024-10-11  5:44 UTC (permalink / raw)
  To: Deepak Gupta
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan, linux-kernel,
	linux-fsdevel, linux-mm, linux-riscv, devicetree, linux-arch,
	linux-doc, linux-kselftest, alistair.francis, richard.henderson,
	jim.shu, andybnac, kito.cheng, charlie, atishp, evan, cleger,
	alexghiti, samitolvanen, broonie, rick.p.edgecombe

On Wed, Oct 9, 2024 at 7:46 AM Deepak Gupta <debug@rivosinc.com> wrote:
>
> Adds kselftest for RISC-V control flow integrity implementation for user
> mode. There is not a lot going on in kernel for enabling landing pad for
> user mode. cfi selftest are intended to be compiled with zicfilp and
> zicfiss enabled compiler. Thus kselftest simply checks if landing pad and
> shadow stack for the binary and process are enabled or not. selftest then
> register a signal handler for SIGSEGV. Any control flow violation are
> reported as SIGSEGV with si_code = SEGV_CPERR. Test will fail on receiving
> any SEGV_CPERR. Shadow stack part has more changes in kernel and thus there
> are separate tests for that
>
> - Exercise `map_shadow_stack` syscall
> - `fork` test to make sure COW works for shadow stack pages
> - gup tests
>   Kernel uses FOLL_FORCE when access happens to memory via
>   /proc/<pid>/mem. Not breaking that for shadow stack.
> - signal test. Make sure signal delivery results in token creation on
>   shadow stack and consumes (and verifies) token on sigreturn
> - shadow stack protection test. attempts to write using regular store
>   instruction on shadow stack memory must result in access faults
>
> Test output
> ==========
>
> """
> TAP version 13
> 1..5
>   This is to ensure shadow stack is indeed enabled and working
>   This is to ensure shadow stack is indeed enabled and working
> ok 1 shstk fork test
> ok 2 map shadow stack syscall
> ok 3 shadow stack gup tests
> ok 4 shadow stack signal tests
> ok 5 memory protections of shadow stack memory
> """
>
> Signed-off-by: Deepak Gupta <debug@rivosinc.com>
> ---
>  tools/testing/selftests/riscv/Makefile             |   2 +-
>  tools/testing/selftests/riscv/cfi/.gitignore       |   3 +
>  tools/testing/selftests/riscv/cfi/Makefile         |  10 +
>  tools/testing/selftests/riscv/cfi/cfi_rv_test.h    |  84 +++++
>  tools/testing/selftests/riscv/cfi/riscv_cfi_test.c |  78 +++++
>  tools/testing/selftests/riscv/cfi/shadowstack.c    | 373 +++++++++++++++++++++
>  tools/testing/selftests/riscv/cfi/shadowstack.h    |  37 ++
>  7 files changed, 586 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/riscv/Makefile b/tools/testing/selftests/riscv/Makefile
> index 7ce03d832b64..6e142fe004ab 100644
> --- a/tools/testing/selftests/riscv/Makefile
> +++ b/tools/testing/selftests/riscv/Makefile
> @@ -5,7 +5,7 @@
>  ARCH ?= $(shell uname -m 2>/dev/null || echo not)
>
>  ifneq (,$(filter $(ARCH),riscv))
> -RISCV_SUBTARGETS ?= hwprobe vector mm sigreturn
> +RISCV_SUBTARGETS ?= hwprobe vector mm sigreturn cfi
>  else
>  RISCV_SUBTARGETS :=
>  endif
> diff --git a/tools/testing/selftests/riscv/cfi/.gitignore b/tools/testing/selftests/riscv/cfi/.gitignore
> new file mode 100644
> index 000000000000..82545863bac6
> --- /dev/null
> +++ b/tools/testing/selftests/riscv/cfi/.gitignore
> @@ -0,0 +1,3 @@
> +cfitests
> +riscv_cfi_test
> +shadowstack
> diff --git a/tools/testing/selftests/riscv/cfi/Makefile b/tools/testing/selftests/riscv/cfi/Makefile
> new file mode 100644
> index 000000000000..b65f7ff38a32
> --- /dev/null
> +++ b/tools/testing/selftests/riscv/cfi/Makefile
> @@ -0,0 +1,10 @@
> +CFLAGS += -I$(top_srcdir)/tools/include
> +
> +CFLAGS += -march=rv64gc_zicfilp_zicfiss
> +
> +TEST_GEN_PROGS := cfitests
> +
> +include ../../lib.mk
> +
> +$(OUTPUT)/cfitests: riscv_cfi_test.c shadowstack.c
> +       $(CC) -o$@ $(CFLAGS) $(LDFLAGS) $^
> diff --git a/tools/testing/selftests/riscv/cfi/cfi_rv_test.h b/tools/testing/selftests/riscv/cfi/cfi_rv_test.h
> new file mode 100644
> index 000000000000..0fefdc33f71e
> --- /dev/null
> +++ b/tools/testing/selftests/riscv/cfi/cfi_rv_test.h
> @@ -0,0 +1,84 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#ifndef SELFTEST_RISCV_CFI_H
> +#define SELFTEST_RISCV_CFI_H
> +#include <stddef.h>
> +#include <sys/types.h>
> +#include "shadowstack.h"
> +
> +#define RISCV_CFI_SELFTEST_COUNT RISCV_SHADOW_STACK_TESTS
> +
> +#define CHILD_EXIT_CODE_SSWRITE                10
> +#define CHILD_EXIT_CODE_SIG_TEST       11
> +
> +#define my_syscall5(num, arg1, arg2, arg3, arg4, arg5)                 \
> +({                                                                     \
> +       register long _num  __asm__ ("a7") = (num);                     \
> +       register long _arg1 __asm__ ("a0") = (long)(arg1);              \
> +       register long _arg2 __asm__ ("a1") = (long)(arg2);              \
> +       register long _arg3 __asm__ ("a2") = (long)(arg3);              \
> +       register long _arg4 __asm__ ("a3") = (long)(arg4);              \
> +       register long _arg5 __asm__ ("a4") = (long)(arg5);              \
> +                                                                       \
> +       __asm__ volatile(                                               \
> +               "ecall\n"                                               \
> +               : "+r"                                                  \
> +               (_arg1)                                                 \
> +               : "r"(_arg2), "r"(_arg3), "r"(_arg4), "r"(_arg5),       \
> +                 "r"(_num)                                             \
> +               : "memory", "cc"                                        \
> +       );                                                              \
> +       _arg1;                                                          \
> +})
> +
> +#define my_syscall3(num, arg1, arg2, arg3)                             \
> +({                                                                     \
> +       register long _num  __asm__ ("a7") = (num);                     \
> +       register long _arg1 __asm__ ("a0") = (long)(arg1);              \
> +       register long _arg2 __asm__ ("a1") = (long)(arg2);              \
> +       register long _arg3 __asm__ ("a2") = (long)(arg3);              \
> +                                                                       \
> +       __asm__ volatile(                                               \
> +               "ecall\n"                                               \
> +               : "+r" (_arg1)                                          \
> +               : "r"(_arg2), "r"(_arg3),                               \
> +                 "r"(_num)                                             \
> +               : "memory", "cc"                                        \
> +       );                                                              \
> +       _arg1;                                                          \
> +})
> +
> +#ifndef __NR_prctl
> +#define __NR_prctl 167
> +#endif
> +
> +#ifndef __NR_map_shadow_stack
> +#define __NR_map_shadow_stack 453
> +#endif
> +
> +#define CSR_SSP 0x011
> +
> +#ifdef __ASSEMBLY__
> +#define __ASM_STR(x)    x
> +#else
> +#define __ASM_STR(x)    #x
> +#endif
> +
> +#define csr_read(csr)                                                  \
> +({                                                                     \
> +       register unsigned long __v;                                     \
> +       __asm__ __volatile__ ("csrr %0, " __ASM_STR(csr)                \
> +                               : "=r" (__v) :                          \
> +                               : "memory");                            \
> +       __v;                                                            \
> +})
> +
> +#define csr_write(csr, val)                                            \
> +({                                                                     \
> +       unsigned long __v = (unsigned long) (val);                      \
> +       __asm__ __volatile__ ("csrw " __ASM_STR(csr) ", %0"             \
> +                               : : "rK" (__v)                          \
> +                               : "memory");                            \
> +})
> +
> +#endif
> diff --git a/tools/testing/selftests/riscv/cfi/riscv_cfi_test.c b/tools/testing/selftests/riscv/cfi/riscv_cfi_test.c
> new file mode 100644
> index 000000000000..720a001f7c31
> --- /dev/null
> +++ b/tools/testing/selftests/riscv/cfi/riscv_cfi_test.c
> @@ -0,0 +1,78 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +
> +#include "../../kselftest.h"
> +#include <signal.h>
> +#include <asm/ucontext.h>
> +#include <linux/prctl.h>
> +#include "cfi_rv_test.h"
> +
> +/* do not optimize cfi related test functions */
> +#pragma GCC push_options
> +#pragma GCC optimize("O0")
> +
> +void sigsegv_handler(int signum, siginfo_t *si, void *uc)
> +{
> +       struct ucontext *ctx = (struct ucontext *) uc;
> +
> +       if (si->si_code == SEGV_CPERR) {

Hi Deepak,
I got some errors when building this test; I suppose they should be
fixed in the next version.

riscv_cfi_test.c: In function 'sigsegv_handler':
riscv_cfi_test.c:17:28: error: 'SEGV_CPERR' undeclared (first use in
this function); did you mean 'SEGV_ACCERR'?
   17 |         if (si->si_code == SEGV_CPERR) {
      |                            ^~~~~~~~~~
      |                            SEGV_ACCERR
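
One possible stopgap for older toolchain headers is a guarded fallback
definition. This is only a sketch and not necessarily what the next
revision should do; building against up-to-date uapi headers is probably
the cleaner route. The value 10 is my understanding of the uapi
asm-generic siginfo.h definition and should be checked against the
header, and model_segv_exit_code() is a hypothetical helper that mirrors
the handler's decision without calling exit().

```c
#include <signal.h>

/* Fallback for libc headers that predate SEGV_CPERR (assumed value). */
#ifndef SEGV_CPERR
#define SEGV_CPERR 10
#endif

#define CHILD_EXIT_CODE_SSWRITE 10	/* same value as in cfi_rv_test.h */

/*
 * Mirrors the decision in sigsegv_handler(): a control flow violation
 * is a hard failure, anything else is treated as the shadow stack
 * write case.
 */
static int model_segv_exit_code(int si_code)
{
	if (si_code == SEGV_CPERR)
		return -1;
	return CHILD_EXIT_CODE_SSWRITE;
}
```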


> +               ksft_print_msg("Control flow violation happened somewhere\n");
> +               ksft_print_msg("PC where violation happened %lx\n", ctx->uc_mcontext.gregs[0]);
> +               exit(-1);
> +       }
> +
> +       /* all other cases are expected to be of shadow stack write case */
> +       exit(CHILD_EXIT_CODE_SSWRITE);
> +}
> +
> +bool register_signal_handler(void)
> +{
> +       struct sigaction sa = {};
> +
> +       sa.sa_sigaction = sigsegv_handler;
> +       sa.sa_flags = SA_SIGINFO;
> +       if (sigaction(SIGSEGV, &sa, NULL)) {
> +               ksft_print_msg("Registering signal handler for landing pad violation failed\n");
> +               return false;
> +       }
> +
> +       return true;
> +}
> +
> +int main(int argc, char *argv[])
> +{
> +       int ret = 0;
> +       unsigned long lpad_status = 0, ss_status = 0;
> +
> +       ksft_print_header();
> +
> +       ksft_print_msg("Starting risc-v tests\n");
> +
> +       /*
> +        * Landing pad test. Not a lot of kernel changes to support landing
> +        * pad for user mode except lighting up a bit in senvcfg via a prctl
> +        * Enable landing pad through out the execution of test binary
> +        */
> +       ret = my_syscall5(__NR_prctl, PR_GET_INDIR_BR_LP_STATUS, &lpad_status, 0, 0, 0);
> +       if (ret)
> +               ksft_exit_fail_msg("Get landing pad status failed with %d\n", ret);
> +
> +       if (!(lpad_status & PR_INDIR_BR_LP_ENABLE))
> +               ksft_exit_fail_msg("Landing pad is not enabled, should be enabled via glibc\n");
> +
> +       ret = my_syscall5(__NR_prctl, PR_GET_SHADOW_STACK_STATUS, &ss_status, 0, 0, 0);
> +       if (ret)
> +               ksft_exit_fail_msg("Get shadow stack failed with %d\n", ret);
> +
> +       if (!(ss_status & PR_SHADOW_STACK_ENABLE))
> +               ksft_exit_fail_msg("Shadow stack is not enabled, should be enabled via glibc\n");
> +
> +       if (!register_signal_handler())
> +               ksft_exit_fail_msg("Registering signal handler for SIGSEGV failed\n");
> +
> +       ksft_print_msg("Landing pad and shadow stack are enabled for binary\n");
> +       execute_shadow_stack_tests();
> +
> +       return 0;
> +}
> +
> +#pragma GCC pop_options
> diff --git a/tools/testing/selftests/riscv/cfi/shadowstack.c b/tools/testing/selftests/riscv/cfi/shadowstack.c
> new file mode 100644
> index 000000000000..9d5301914578
> --- /dev/null
> +++ b/tools/testing/selftests/riscv/cfi/shadowstack.c
> @@ -0,0 +1,373 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +
> +#include "../../kselftest.h"
> +#include <sys/wait.h>
> +#include <signal.h>
> +#include <fcntl.h>
> +#include <asm-generic/unistd.h>
> +#include <sys/mman.h>
> +#include "shadowstack.h"
> +#include "cfi_rv_test.h"
> +
> +/* do not optimize shadow stack related test functions */
> +#pragma GCC push_options
> +#pragma GCC optimize("O0")
> +
> +void zar(void)
> +{
> +       unsigned long ssp = 0;
> +
> +       ssp = csr_read(CSR_SSP);
> +       ksft_print_msg("Spewing out shadow stack ptr: %lx\n"
> +                       "  This is to ensure shadow stack is indeed enabled and working\n",
> +                       ssp);
> +}
> +
> +void bar(void)
> +{
> +       zar();
> +}
> +
> +void foo(void)
> +{
> +       bar();
> +}
> +
> +void zar_child(void)
> +{
> +       unsigned long ssp = 0;
> +
> +       ssp = csr_read(CSR_SSP);
> +       ksft_print_msg("Spewing out shadow stack ptr: %lx\n"
> +                       "  This is to ensure shadow stack is indeed enabled and working\n",
> +                       ssp);
> +}
> +
> +void bar_child(void)
> +{
> +       zar_child();
> +}
> +
> +void foo_child(void)
> +{
> +       bar_child();
> +}
> +
> +typedef void (call_func_ptr)(void);
> +/*
> + * call couple of functions to test push pop.
> + */
> +int shadow_stack_call_tests(call_func_ptr fn_ptr, bool parent)
> +{
> +       ksft_print_msg("Exercising dummy calls for sspush and sspopchk in"
> +                       " context of %s\n", parent ? "parent" : "child");
> +
> +       (fn_ptr)();
> +
> +       return 0;
> +}
> +
> +/* fork a child process and ensure shadow stack works in both parent and child */
> +bool shadow_stack_fork_test(unsigned long test_num, void *ctx)
> +{
> +       int pid = 0, child_status = 0, parent_pid = 0, ret = 0;
> +       unsigned long ss_status = 0;
> +
> +       ksft_print_msg("Exercising shadow stack fork test\n");
> +
> +       ret = my_syscall5(__NR_prctl, PR_GET_SHADOW_STACK_STATUS, &ss_status, 0, 0, 0);
> +       if (ret) {
> +               ksft_exit_skip("Shadow stack get status prctl failed with errorcode %d\n", ret);
> +               return false;
> +       }
> +
> +       if (!(ss_status & PR_SHADOW_STACK_ENABLE))
> +               ksft_exit_skip("Shadow stack is not enabled, should be enabled via glibc\n");
> +
> +       parent_pid = getpid();
> +       pid = fork();
> +
> +       if (pid) {
> +               ksft_print_msg("Parent pid %d and child pid %d\n", parent_pid, pid);
> +               shadow_stack_call_tests(&foo, true);
> +       } else
> +               shadow_stack_call_tests(&foo_child, false);
> +
> +       if (pid) {
> +               ksft_print_msg("Waiting on child to finish\n");
> +               wait(&child_status);
> +       } else {
> +               /* exit child gracefully */
> +               exit(0);
> +       }
> +
> +       if (pid && WIFSIGNALED(child_status)) {
> +               ksft_print_msg("Child faulted, fork test failed\n");
> +               return false;
> +       }
> +
> +       return true;
> +}
> +
> +/* exercise `map_shadow_stack`: map a shadow stack and unmap it again */
> +#define SHADOW_STACK_ALLOC_SIZE 4096
> +bool shadow_stack_map_test(unsigned long test_num, void *ctx)
> +{
> +       unsigned long shdw_addr;
> +       int ret = 0;
> +
> +       ksft_print_msg("Exercising shadow stack map test\n");
> +
> +       shdw_addr = my_syscall3(__NR_map_shadow_stack, NULL, SHADOW_STACK_ALLOC_SIZE, 0);
> +
> +       if (((long) shdw_addr) <= 0) {
> +               ksft_print_msg("map_shadow_stack failed with error code %d\n", (int) shdw_addr);
> +               return false;
> +       }
> +
> +       ret = munmap((void *) shdw_addr, SHADOW_STACK_ALLOC_SIZE);
> +
> +       if (ret) {
> +               ksft_print_msg("munmap failed with error code %d\n", ret);
> +               return false;
> +       }
> +
> +       return true;
> +}
> +
> +/*
> + * shadow stack protection tests. map a shadow stack and
> + * validate all memory protections work on it
> + */
> +bool shadow_stack_protection_test(unsigned long test_num, void *ctx)
> +{
> +       unsigned long shdw_addr;
> +       unsigned long *write_addr = NULL;
> +       int ret = 0, pid = 0, child_status = 0;
> +
> +       ksft_print_msg("Exercising shadow stack protection test\n");
> +
> +       shdw_addr = my_syscall3(__NR_map_shadow_stack, NULL, SHADOW_STACK_ALLOC_SIZE, 0);
> +
> +       if (((long) shdw_addr) <= 0) {
> +               ksft_print_msg("map_shadow_stack failed with error code %d\n", (int) shdw_addr);
> +               return false;
> +       }
> +
> +       write_addr = (unsigned long *) shdw_addr;
> +       pid = fork();
> +
> +       /* no child was created, return false */
> +       if (pid == -1)
> +               return false;
> +
> +       /*
> +        * try to perform a store from child on shadow stack memory
> +        * it should result in SIGSEGV
> +        */
> +       if (!pid) {
> +               /* below write must lead to SIGSEGV */
> +               *write_addr = 0xdeadbeef;
> +       } else {
> +               wait(&child_status);
> +       }
> +
> +       /* test fail, if 0xdeadbeef present on shadow stack address */
> +       if (*write_addr == 0xdeadbeef) {
> +               ksft_print_msg("Write succeeded on shadow stack memory, shadow stack protection test"
> +               " failed\n");
> +               return false;
> +       }
> +
> +       /* if child reached here, then fail */
> +       if (!pid) {
> +               ksft_print_msg("Shadow stack protection test: child reached unreachable state\n");
> +               return false;
> +       }
> +
> +       /* if child exited via signal handler but not for write on ss */
> +       if (WIFEXITED(child_status) &&
> +               WEXITSTATUS(child_status) != CHILD_EXIT_CODE_SSWRITE) {
> +               ksft_print_msg("Shadow stack protection test: child wasn't signaled for write on"
> +               " shadow stack\n");
> +               return false;
> +       }
> +
> +       ret = munmap(write_addr, SHADOW_STACK_ALLOC_SIZE);
> +       if (ret) {
> +               ksft_print_msg("Shadow stack protection test: munmap failed with error code %d\n",
> +               ret);
> +               return false;
> +       }
> +
> +       return true;
> +}
> +
> +#define SS_MAGIC_WRITE_VAL 0xbeefdead
> +
> +int gup_tests(int mem_fd, unsigned long *shdw_addr)
> +{
> +       unsigned long val = 0;
> +
> +       lseek(mem_fd, (unsigned long)shdw_addr, SEEK_SET);
> +       if (read(mem_fd, &val, sizeof(val)) < 0) {
> +               ksft_print_msg("Reading shadow stack mem via gup failed\n");
> +               return 1;
> +       }
> +
> +       val = SS_MAGIC_WRITE_VAL;
> +       lseek(mem_fd, (unsigned long)shdw_addr, SEEK_SET);
> +       if (write(mem_fd, &val, sizeof(val)) < 0) {
> +               ksft_print_msg("Writing shadow stack mem via gup failed\n");
> +               return 1;
> +       }
> +
> +       if (*shdw_addr != SS_MAGIC_WRITE_VAL) {
> +               ksft_print_msg("GUP write to shadow stack memory failed\n");
> +               return 1;
> +       }
> +
> +       return 0;
> +}
> +
> +bool shadow_stack_gup_tests(unsigned long test_num, void *ctx)
> +{
> +       unsigned long shdw_addr = 0;
> +       unsigned long *write_addr = NULL;
> +       int fd = 0;
> +       bool ret = false;
> +
> +       ksft_print_msg("Exercising shadow stack gup tests\n");
> +       shdw_addr = my_syscall3(__NR_map_shadow_stack, NULL, SHADOW_STACK_ALLOC_SIZE, 0);
> +
> +       if (((long) shdw_addr) <= 0) {
> +               ksft_print_msg("map_shadow_stack failed with error code %d\n", (int) shdw_addr);
> +               return false;
> +       }
> +
> +       write_addr = (unsigned long *) shdw_addr;
> +
> +       fd = open("/proc/self/mem", O_RDWR);
> +       if (fd == -1)
> +               return false;
> +
> +       if (gup_tests(fd, write_addr)) {
> +               ksft_print_msg("gup tests failed\n");
> +               goto out;
> +       }
> +
> +       ret = true;
> +out:
> +       if (shdw_addr && munmap(write_addr, SHADOW_STACK_ALLOC_SIZE)) {
> +               ksft_print_msg("munmap failed\n");
> +               ret = false;
> +       }
> +
> +       return ret;
> +}
> +
> +volatile bool break_loop;
> +
> +void sigusr1_handler(int signo)
> +{
> +       break_loop = true;
> +}
> +
> +bool sigusr1_signal_test(void)
> +{
> +       struct sigaction sa = {};
> +
> +       sa.sa_handler = sigusr1_handler;
> +       sa.sa_flags = 0;
> +       sigemptyset(&sa.sa_mask);
> +       if (sigaction(SIGUSR1, &sa, NULL)) {
> +               ksft_print_msg("Registering signal handler for SIGUSR1 failed\n");
> +               return false;
> +       }
> +
> +       return true;
> +}
> +/*
> + * shadow stack signal test. shadow stack must be enabled.
> + * register a signal handler, fork a child process which waits
> + * on the signal. Send the signal from parent to child and verify
> + * that the child received it. If not, the test fails.
> + */
> +bool shadow_stack_signal_test(unsigned long test_num, void *ctx)
> +{
> +       int pid = 0, child_status = 0, ret = 0;
> +       unsigned long ss_status = 0;
> +
> +       ksft_print_msg("Exercising shadow stack signal test\n");
> +
> +       ret = my_syscall5(__NR_prctl, PR_GET_SHADOW_STACK_STATUS, &ss_status, 0, 0, 0);
> +       if (ret) {
> +               ksft_print_msg("Shadow stack get status prctl failed with errorcode %d\n", ret);
> +               return false;
> +       }
> +
> +       if (!(ss_status & PR_SHADOW_STACK_ENABLE))
> +               ksft_print_msg("Shadow stack is not enabled, should be enabled via glibc\n");
> +
> +       /* register the SIGUSR1 handler; the child exits after the handler runs */
> +       if (!sigusr1_signal_test()) {
> +               ksft_print_msg("Registering sigusr1 handler failed\n");
> +               exit(-1);
> +       }
> +
> +       pid = fork();
> +
> +       if (pid == -1) {
> +               ksft_print_msg("Signal test: fork failed\n");
> +               goto out;
> +       }
> +
> +       if (pid == 0) {
> +               while (!break_loop)
> +                       sleep(1);
> +
> +               exit(11);
> +               /* child shouldn't go beyond here */
> +       }
> +
> +       /* send SIGUSR1 to child */
> +       kill(pid, SIGUSR1);
> +       wait(&child_status);
> +
> +out:
> +
> +       return (WIFEXITED(child_status) &&
> +                       WEXITSTATUS(child_status) == 11);
> +}
> +
> +int execute_shadow_stack_tests(void)
> +{
> +       int ret = 0;
> +       unsigned long test_count = 0;
> +       unsigned long shstk_status = 0;
> +       bool test_pass = false;
> +
> +       ksft_print_msg("Executing RISC-V shadow stack self tests\n");
> +       ksft_set_plan(RISCV_SHADOW_STACK_TESTS);
> +
> +       ret = my_syscall5(__NR_prctl, PR_GET_SHADOW_STACK_STATUS, &shstk_status, 0, 0, 0);
> +
> +       if (ret != 0)
> +               ksft_exit_fail_msg("Get shadow stack status failed with %d\n", ret);
> +
> +       /*
> +        * If we are here that means get shadow stack status succeeded and
> +        * thus shadow stack support is baked in the kernel.
> +        */
> +       while (test_count < ARRAY_SIZE(shstk_tests)) {
> +               test_pass = (*shstk_tests[test_count].t_func)(test_count, NULL);
> +               ksft_test_result(test_pass, "%s", shstk_tests[test_count].name);
> +               test_count++;
> +       }
> +
> +       ksft_finished();
> +
> +       return 0;
> +}
> +
> +#pragma GCC pop_options
> diff --git a/tools/testing/selftests/riscv/cfi/shadowstack.h b/tools/testing/selftests/riscv/cfi/shadowstack.h
> new file mode 100644
> index 000000000000..b43e74136a26
> --- /dev/null
> +++ b/tools/testing/selftests/riscv/cfi/shadowstack.h
> @@ -0,0 +1,37 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#ifndef SELFTEST_SHADOWSTACK_TEST_H
> +#define SELFTEST_SHADOWSTACK_TEST_H
> +#include <stddef.h>
> +#include <linux/prctl.h>
> +
> +/*
> + * a cfi test returns true for success or false for fail
> + * takes a number for test number to index into array and void pointer.
> + */
> +typedef bool (*shstk_test_func)(unsigned long test_num, void *);
> +
> +struct shadow_stack_tests {
> +       char *name;
> +       shstk_test_func t_func;
> +};
> +
> +bool shadow_stack_fork_test(unsigned long test_num, void *ctx);
> +bool shadow_stack_map_test(unsigned long test_num, void *ctx);
> +bool shadow_stack_protection_test(unsigned long test_num, void *ctx);
> +bool shadow_stack_gup_tests(unsigned long test_num, void *ctx);
> +bool shadow_stack_signal_test(unsigned long test_num, void *ctx);
> +
> +static struct shadow_stack_tests shstk_tests[] = {

shadowstack.h:25:34: warning: 'shstk_tests' defined but not used [-Wunused-variable]
   25 | static struct shadow_stack_tests shstk_tests[] = {

> +       { "shstk fork test\n", shadow_stack_fork_test },
> +       { "map shadow stack syscall\n", shadow_stack_map_test },
> +       { "shadow stack gup tests\n", shadow_stack_gup_tests },
> +       { "shadow stack signal tests\n", shadow_stack_signal_test},
> +       { "memory protections of shadow stack memory\n", shadow_stack_protection_test }
> +};
> +
> +#define RISCV_SHADOW_STACK_TESTS ARRAY_SIZE(shstk_tests)
> +
> +int execute_shadow_stack_tests(void);
> +
> +#endif
>
> --
> 2.45.0
>
>



* Re: [PATCH v6 33/33] kselftest/riscv: kselftest for user mode cfi
  2024-10-11  5:44   ` Zong Li
@ 2024-10-11 10:18     ` Mark Brown
  2024-10-11 11:43       ` Zong Li
  0 siblings, 1 reply; 52+ messages in thread
From: Mark Brown @ 2024-10-11 10:18 UTC (permalink / raw)
  To: Zong Li
  Cc: Deepak Gupta, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin, Andrew Morton, Liam R. Howlett,
	Vlastimil Babka, Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Conor Dooley, Rob Herring, Krzysztof Kozlowski,
	Arnd Bergmann, Christian Brauner, Peter Zijlstra, Oleg Nesterov,
	Eric Biederman, Kees Cook, Jonathan Corbet, Shuah Khan,
	linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, rick.p.edgecombe


On Fri, Oct 11, 2024 at 01:44:55PM +0800, Zong Li wrote:
> On Wed, Oct 9, 2024 at 7:46 AM Deepak Gupta <debug@rivosinc.com> wrote:

> > +       if (si->si_code == SEGV_CPERR) {

> Hi Deepak,
> I got some errors when building this test, I suppose they should be
> fixed in the next version.

> riscv_cfi_test.c: In function 'sigsegv_handler':
> riscv_cfi_test.c:17:28: error: 'SEGV_CPERR' undeclared (first use in
> this function); did you mean 'SEGV_ACCERR'?
>    17 |         if (si->si_code == SEGV_CPERR) {
>       |                            ^~~~~~~~~~
>       |                            SEGV_ACCERR
> 

Did you run "make headers_install" prior to building kselftest to get
the current kernel's headers available for userspace builds?



* Re: [PATCH v6 33/33] kselftest/riscv: kselftest for user mode cfi
  2024-10-11 10:18     ` Mark Brown
@ 2024-10-11 11:43       ` Zong Li
  2024-10-11 19:45         ` Deepak Gupta
  0 siblings, 1 reply; 52+ messages in thread
From: Zong Li @ 2024-10-11 11:43 UTC (permalink / raw)
  To: Mark Brown
  Cc: Deepak Gupta, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin, Andrew Morton, Liam R. Howlett,
	Vlastimil Babka, Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Conor Dooley, Rob Herring, Krzysztof Kozlowski,
	Arnd Bergmann, Christian Brauner, Peter Zijlstra, Oleg Nesterov,
	Eric Biederman, Kees Cook, Jonathan Corbet, Shuah Khan,
	linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, rick.p.edgecombe

On Fri, Oct 11, 2024 at 6:18 PM Mark Brown <broonie@kernel.org> wrote:
>
> On Fri, Oct 11, 2024 at 01:44:55PM +0800, Zong Li wrote:
> > On Wed, Oct 9, 2024 at 7:46 AM Deepak Gupta <debug@rivosinc.com> wrote:
>
> > > +       if (si->si_code == SEGV_CPERR) {
>
> > Hi Deepak,
> > I got some errors when building this test, I suppose they should be
> > fixed in the next version.
>
> > riscv_cfi_test.c: In function 'sigsegv_handler':
> > riscv_cfi_test.c:17:28: error: 'SEGV_CPERR' undeclared (first use in
> > this function); did you mean 'SEGV_ACCERR'?
> >    17 |         if (si->si_code == SEGV_CPERR) {
> >       |                            ^~~~~~~~~~
> >       |                            SEGV_ACCERR
> >
>
> Did you run "make headers_install" prior to building kselftest to get
> the current kernel's headers available for userspace builds?

Yes, I ran "make headers" and "make headers_install" before
building the kselftest. The error happens when I cross-compile it;
I can help check whether a header file or header search path is
missing.



* Re: [PATCH v6 33/33] kselftest/riscv: kselftest for user mode cfi
  2024-10-11 11:43       ` Zong Li
@ 2024-10-11 19:45         ` Deepak Gupta
  2024-10-14 14:33           ` Zong Li
  0 siblings, 1 reply; 52+ messages in thread
From: Deepak Gupta @ 2024-10-11 19:45 UTC (permalink / raw)
  To: Zong Li
  Cc: Mark Brown, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin, Andrew Morton, Liam R. Howlett,
	Vlastimil Babka, Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Conor Dooley, Rob Herring, Krzysztof Kozlowski,
	Arnd Bergmann, Christian Brauner, Peter Zijlstra, Oleg Nesterov,
	Eric Biederman, Kees Cook, Jonathan Corbet, Shuah Khan,
	linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, rick.p.edgecombe

On Fri, Oct 11, 2024 at 07:43:30PM +0800, Zong Li wrote:
>On Fri, Oct 11, 2024 at 6:18 PM Mark Brown <broonie@kernel.org> wrote:
>>
>> On Fri, Oct 11, 2024 at 01:44:55PM +0800, Zong Li wrote:
>> > On Wed, Oct 9, 2024 at 7:46 AM Deepak Gupta <debug@rivosinc.com> wrote:
>>
>> > > +       if (si->si_code == SEGV_CPERR) {
>>
>> > Hi Deepak,
>> > I got some errors when building this test, I suppose they should be
>> > fixed in the next version.
>>
>> > riscv_cfi_test.c: In function 'sigsegv_handler':
>> > riscv_cfi_test.c:17:28: error: 'SEGV_CPERR' undeclared (first use in
>> > this function); did you mean 'SEGV_ACCERR'?
>> >    17 |         if (si->si_code == SEGV_CPERR) {
>> >       |                            ^~~~~~~~~~
>> >       |                            SEGV_ACCERR
>> >
>>
>> Did you run "make headers_install" prior to building kselftest to get
>> the current kernel's headers available for userspace builds?
>
>Yes, I ran "make headers" and "make headers_install" before
>building the kselftest. The error happens when I cross-compile it;
>I can help check whether a header file or header search path is
>missing.

That's weird.

It doesn't fail for me even if I do not run `make headers_install`. But I am
building the kernel and selftests with a toolchain which supports shadow stack
and landing pad. SEGV_CPERR is defined in `siginfo.h`. When I built the
toolchain, I pointed it at the latest kernel headers. Maybe that's the trick.

"""

$ grep -nir SEGV_CPERR /scratch/debug/linux/kbuild/usr/include/*
/scratch/debug/linux/kbuild/usr/include/asm-generic/siginfo.h:240:#define SEGV_CPERR    10      /* Control protection fault */

$ grep -nir SEGV_CPERR /scratch/debug/open_src/sifive_cfi_toolchain/INSTALL_Sept18/sysroot/usr/*
/scratch/debug/open_src/sifive_cfi_toolchain/INSTALL_Sept18/sysroot/usr/include/asm-generic/siginfo.h:240:#define SEGV_CPERR    10      /* Control protection fault */
/scratch/debug/open_src/sifive_cfi_toolchain/INSTALL_Sept18/sysroot/usr/include/bits/siginfo-consts.h:139:  SEGV_CPERR                  /* Control protection fault.  */
/scratch/debug/open_src/sifive_cfi_toolchain/INSTALL_Sept18/sysroot/usr/include/bits/siginfo-consts.h:140:#  define SEGV_CPERR  SEGV_CPERR

"""




* Re: [PATCH v6 33/33] kselftest/riscv: kselftest for user mode cfi
  2024-10-11 19:45         ` Deepak Gupta
@ 2024-10-14 14:33           ` Zong Li
  0 siblings, 0 replies; 52+ messages in thread
From: Zong Li @ 2024-10-14 14:33 UTC (permalink / raw)
  To: Deepak Gupta
  Cc: Mark Brown, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin, Andrew Morton, Liam R. Howlett,
	Vlastimil Babka, Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Conor Dooley, Rob Herring, Krzysztof Kozlowski,
	Arnd Bergmann, Christian Brauner, Peter Zijlstra, Oleg Nesterov,
	Eric Biederman, Kees Cook, Jonathan Corbet, Shuah Khan,
	linux-kernel, linux-fsdevel, linux-mm, linux-riscv, devicetree,
	linux-arch, linux-doc, linux-kselftest, alistair.francis,
	richard.henderson, jim.shu, andybnac, kito.cheng, charlie, atishp,
	evan, cleger, alexghiti, samitolvanen, rick.p.edgecombe

On Sat, Oct 12, 2024 at 3:46 AM Deepak Gupta <debug@rivosinc.com> wrote:
>
> On Fri, Oct 11, 2024 at 07:43:30PM +0800, Zong Li wrote:
> >On Fri, Oct 11, 2024 at 6:18 PM Mark Brown <broonie@kernel.org> wrote:
> >>
> >> On Fri, Oct 11, 2024 at 01:44:55PM +0800, Zong Li wrote:
> >> > On Wed, Oct 9, 2024 at 7:46 AM Deepak Gupta <debug@rivosinc.com> wrote:
> >>
> >> > > +       if (si->si_code == SEGV_CPERR) {
> >>
> >> > Hi Deepak,
> >> > I got some errors when building this test, I suppose they should be
> >> > fixed in the next version.
> >>
> >> > riscv_cfi_test.c: In function 'sigsegv_handler':
> >> > riscv_cfi_test.c:17:28: error: 'SEGV_CPERR' undeclared (first use in
> >> > this function); did you mean 'SEGV_ACCERR'?
> >> >    17 |         if (si->si_code == SEGV_CPERR) {
> >> >       |                            ^~~~~~~~~~
> >> >       |                            SEGV_ACCERR
> >> >
> >>
> >> Did you run "make headers_install" prior to building kselftest to get
> >> the current kernel's headers available for userspace builds?
> >
> >Yes, I ran "make headers" and "make headers_install" before
> >building the kselftest. The error happens when I cross-compile it;
> >I can help check whether a header file or header search path is
> >missing.
>
> That's weird.
>
> It doesn't fail for me even if I do not run `make headers_install`. But I am
> building the kernel and selftests with a toolchain which supports shadow stack
> and landing pad. SEGV_CPERR is defined in `siginfo.h`. When I built the
> toolchain, I pointed it at the latest kernel headers. Maybe that's the trick.
>
> """
>
> $ grep -nir SEGV_CPERR /scratch/debug/linux/kbuild/usr/include/*
> /scratch/debug/linux/kbuild/usr/include/asm-generic/siginfo.h:240:#define SEGV_CPERR    10      /* Control protection fault */
>
> $ grep -nir SEGV_CPERR /scratch/debug/open_src/sifive_cfi_toolchain/INSTALL_Sept18/sysroot/usr/*
> /scratch/debug/open_src/sifive_cfi_toolchain/INSTALL_Sept18/sysroot/usr/include/asm-generic/siginfo.h:240:#define SEGV_CPERR    10      /* Control protection fault */
> /scratch/debug/open_src/sifive_cfi_toolchain/INSTALL_Sept18/sysroot/usr/include/bits/siginfo-consts.h:139:  SEGV_CPERR                  /* Control protection fault.  */
> /scratch/debug/open_src/sifive_cfi_toolchain/INSTALL_Sept18/sysroot/usr/include/bits/siginfo-consts.h:140:#  define SEGV_CPERR  SEGV_CPERR
>
> """

In my case, the test files don't explicitly include siginfo.h, so I
assume siginfo.h is expected to be pulled in through signal.h. Following
the header search path, the compiler eventually finds signal.h in
toolchain_path/sysroot/usr/include/. My
toolchain_path/sysroot/usr/include/signal.h doesn't include any
signal.h; instead, signal.h would have to be included from
toolchain_path/sysroot/usr/include/linux/signal.h or
kernel_src/usr/include/linux/signal.h rather than
toolchain_path/sysroot/usr/include/signal.h. I think that is why I lost
the SEGV_CPERR definition. Is your setup any different?
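
One way to compare the two setups might be a small probe built with the same
compiler and include paths as the selftest. This is only a sketch: the
function names are hypothetical, and returning -1 for "not defined" is an
arbitrary convention chosen here.

```c
#include <assert.h>
#include <signal.h>

/*
 * Report what <signal.h> alone provides: the value of SEGV_CPERR when the
 * headers on the search path define it as a macro, or -1 when they do not.
 */
int segv_cperr_from_headers(void)
{
#ifdef SEGV_CPERR
	return SEGV_CPERR;
#else
	return -1;
#endif
}

/* Hypothetical guard: abort early when the definition is not reachable. */
void require_segv_cperr(void)
{
	assert(segv_cperr_from_headers() != -1);
}
```

Printing the result of segv_cperr_from_headers() from a test binary built in
each environment would show whether the definition is reachable through
<signal.h> in that sysroot.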

>



* Re: [PATCH v6 07/33] dt-bindings: riscv: zicfilp and zicfiss in dt-bindings (extensions.yaml)
  2024-10-08 22:36 ` [PATCH v6 07/33] dt-bindings: riscv: zicfilp and zicfiss in dt-bindings (extensions.yaml) Deepak Gupta
@ 2024-10-25 21:58   ` Rob Herring (Arm)
  0 siblings, 0 replies; 52+ messages in thread
From: Rob Herring (Arm) @ 2024-10-25 21:58 UTC (permalink / raw)
  To: Deepak Gupta
  Cc: H. Peter Anvin, rick.p.edgecombe, richard.henderson, cleger,
	alexghiti, Eric Biederman, Conor Dooley, Krzysztof Kozlowski,
	Vlastimil Babka, linux-mm, samitolvanen, linux-doc,
	Peter Zijlstra, atishp, Paul Walmsley, linux-kernel,
	Thomas Gleixner, linux-arch, Jonathan Corbet, Liam R. Howlett,
	evan, x86, Shuah Khan, devicetree, alistair.francis, broonie,
	Ingo Molnar, Palmer Dabbelt, linux-kselftest, Oleg Nesterov,
	linux-riscv, charlie, Andrew Morton, Lorenzo Stoakes,
	Arnd Bergmann, Albert Ou, Kees Cook, andybnac, linux-fsdevel,
	Dave Hansen, Christian Brauner, kito.cheng, Borislav Petkov,
	jim.shu


On Tue, 08 Oct 2024 15:36:49 -0700, Deepak Gupta wrote:
> Make an entry for cfi extensions in extensions.yaml.
> 
> Signed-off-by: Deepak Gupta <debug@rivosinc.com>
> ---
>  Documentation/devicetree/bindings/riscv/extensions.yaml | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 

Acked-by: Rob Herring (Arm) <robh@kernel.org>




* Re: [PATCH v6 06/33] riscv/Kconfig: enable HAVE_EXIT_THREAD for riscv
  2024-10-09 11:28   ` Mark Brown
@ 2024-10-29 22:06     ` Deepak Gupta
  0 siblings, 0 replies; 52+ messages in thread
From: Deepak Gupta @ 2024-10-29 22:06 UTC (permalink / raw)
  To: Mark Brown
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Conor Dooley, Rob Herring, Krzysztof Kozlowski, Arnd Bergmann,
	Christian Brauner, Peter Zijlstra, Oleg Nesterov, Eric Biederman,
	Kees Cook, Jonathan Corbet, Shuah Khan, linux-kernel,
	linux-fsdevel, linux-mm, linux-riscv, devicetree, linux-arch,
	linux-doc, linux-kselftest, alistair.francis, richard.henderson,
	jim.shu, andybnac, kito.cheng, charlie, atishp, evan, cleger,
	alexghiti, samitolvanen, rick.p.edgecombe

On Wed, Oct 09, 2024 at 12:28:03PM +0100, Mark Brown wrote:
>On Tue, Oct 08, 2024 at 03:36:48PM -0700, Deepak Gupta wrote:
>
>> riscv will need an implementation for exit_thread to clean up shadow stack
>> when thread exits. If current thread had shadow stack enabled, shadow
>> stack is allocated by default for any new thread.
>
>FWIW both arm64 and x86 do this via deactivate_mm().  ISTR there's some
>case where exit_thread() doesn't quite do the right thing but I can't
>remember the specifics right now, possibly the vfork() case but ICBW?
>In any case like Rick said factoring out the common patterns would be
>good, keeping things aligned would support that.

I am now getting back to collecting feedback and sending another version.
Yes, I found what you meant:
https://lore.kernel.org/all/20230908203655.543765-1-rick.p.edgecombe@intel.com/#t

This seems to be an issue for riscv as well. I will fix it.
This particular issue is screaming out loud for converging the flows as well.






Thread overview: 52+ messages
2024-10-08 22:36 [PATCH v6 00/33] riscv control-flow integrity for usermode Deepak Gupta
2024-10-08 22:36 ` [PATCH v6 01/33] mm: Introduce ARCH_HAS_USER_SHADOW_STACK Deepak Gupta
2024-10-08 22:36 ` [PATCH v6 02/33] mm: helper `is_shadow_stack_vma` to check shadow stack vma Deepak Gupta
2024-10-09 11:11   ` Mark Brown
2024-10-08 22:36 ` [PATCH v6 03/33] riscv: Enable cbo.zero only when all harts support Zicboz Deepak Gupta
2024-10-08 22:36 ` [PATCH v6 04/33] riscv: Add support for per-thread envcfg CSR values Deepak Gupta
2024-10-08 22:36 ` [PATCH v6 05/33] riscv: Call riscv_user_isa_enable() only on the boot hart Deepak Gupta
2024-10-08 22:36 ` [PATCH v6 06/33] riscv/Kconfig: enable HAVE_EXIT_THREAD for riscv Deepak Gupta
2024-10-09 11:28   ` Mark Brown
2024-10-29 22:06     ` Deepak Gupta
2024-10-08 22:36 ` [PATCH v6 07/33] dt-bindings: riscv: zicfilp and zicfiss in dt-bindings (extensions.yaml) Deepak Gupta
2024-10-25 21:58   ` Rob Herring (Arm)
2024-10-08 22:36 ` [PATCH v6 08/33] riscv: zicfiss / zicfilp enumeration Deepak Gupta
2024-10-08 22:36 ` [PATCH v6 09/33] riscv: zicfiss / zicfilp extension csr and bit definitions Deepak Gupta
2024-10-08 22:36 ` [PATCH v6 10/33] riscv: usercfi state for task and save/restore of CSR_SSP on trap entry/exit Deepak Gupta
2024-10-08 22:36 ` [PATCH v6 11/33] riscv/mm : ensure PROT_WRITE leads to VM_READ | VM_WRITE Deepak Gupta
2024-10-09 13:36   ` Lorenzo Stoakes
2024-10-10  0:02     ` Deepak Gupta
2024-10-08 22:36 ` [PATCH v6 12/33] riscv mm: manufacture shadow stack pte Deepak Gupta
2024-10-08 22:36 ` [PATCH v6 13/33] riscv mmu: teach pte_mkwrite to manufacture shadow stack PTEs Deepak Gupta
2024-10-08 22:36 ` [PATCH v6 14/33] riscv mmu: write protect and shadow stack Deepak Gupta
2024-10-08 22:36 ` [PATCH v6 15/33] riscv/mm: Implement map_shadow_stack() syscall Deepak Gupta
2024-10-08 22:36 ` [PATCH v6 16/33] riscv/shstk: If needed allocate a new shadow stack on clone Deepak Gupta
2024-10-08 22:55   ` Edgecombe, Rick P
2024-10-08 23:17     ` Deepak Gupta
2024-10-08 23:31       ` Edgecombe, Rick P
2024-10-09 10:25     ` Mark Brown
2024-10-08 22:36 ` [PATCH v6 17/33] prctl: arch-agnostic prctl for shadow stack Deepak Gupta
2024-10-08 22:37 ` [PATCH v6 18/33] prctl: arch-agnostic prctl for indirect branch tracking Deepak Gupta
2024-10-09 11:03   ` Mark Brown
2024-10-08 22:37 ` [PATCH v6 19/33] riscv: Implements arch agnostic shadow stack prctls Deepak Gupta
2024-10-09 12:44   ` Mark Brown
2024-10-08 22:37 ` [PATCH v6 20/33] riscv: Implements arch agnostic indirect branch tracking prctls Deepak Gupta
2024-10-08 22:37 ` [PATCH v6 21/33] riscv/traps: Introduce software check exception Deepak Gupta
2024-10-08 22:37 ` [PATCH v6 22/33] riscv: signal: abstract header saving for setup_sigcontext Deepak Gupta
2024-10-08 22:37 ` [PATCH v6 23/33] riscv/signal: save and restore of shadow stack for signal Deepak Gupta
2024-10-08 22:37 ` [PATCH v6 24/33] riscv/kernel: update __show_regs to print shadow stack register Deepak Gupta
2024-10-08 22:37 ` [PATCH v6 25/33] riscv/ptrace: riscv cfi status and state via ptrace and in core files Deepak Gupta
2024-10-08 22:37 ` [PATCH v6 26/33] riscv/hwprobe: zicfilp / zicfiss enumeration in hwprobe Deepak Gupta
2024-10-08 22:37 ` [PATCH v6 27/33] riscv: Add Firmware Feature SBI extensions definitions Deepak Gupta
2024-10-08 22:37 ` [PATCH v6 28/33] riscv: enable kernel access to shadow stack memory via FWFT sbi call Deepak Gupta
2024-10-08 22:37 ` [PATCH v6 29/33] riscv: kernel command line option to opt out of user cfi Deepak Gupta
2024-10-08 22:37 ` [PATCH v6 30/33] riscv: create a config for shadow stack and landing pad instr support Deepak Gupta
2024-10-08 22:37 ` [PATCH v6 31/33] riscv: Documentation for landing pad / indirect branch tracking Deepak Gupta
2024-10-08 22:37 ` [PATCH v6 32/33] riscv: Documentation for shadow stack on riscv Deepak Gupta
2024-10-08 22:37 ` [PATCH v6 33/33] kselftest/riscv: kselftest for user mode cfi Deepak Gupta
2024-10-11  5:44   ` Zong Li
2024-10-11 10:18     ` Mark Brown
2024-10-11 11:43       ` Zong Li
2024-10-11 19:45         ` Deepak Gupta
2024-10-14 14:33           ` Zong Li
2024-10-09 11:05 ` [PATCH v6 00/33] riscv control-flow integrity for usermode Mark Brown
