linux-fsdevel.vger.kernel.org archive mirror
* [PATCH v4 00/29] arm64: Permission Overlay Extension
@ 2024-05-03 13:01 Joey Gouly
  2024-05-03 13:01 ` [PATCH v4 01/29] powerpc/mm: add ARCH_PKEY_BITS to Kconfig Joey Gouly
                   ` (30 more replies)
  0 siblings, 31 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

Hi all,

This series implements the Permission Overlay Extension introduced in the 2022
VMSA enhancements [1]. It is based on v6.9-rc5.

One possible issue with this version: I took the last bit of HWCAP2.

Changes since v3[2]:
	- Moved Kconfig to nearer the end of the series
	- Reworked MMU Fault path, to check for POE faults earlier, under the mm lock
	- Rework VM_FLAGS to use Kconfig option
	- Don't check POR_EL0 in MTE sync tags function
	- Reworked KVM to fit into VNCR/VM configuration changes
	- Use new AT instruction in KVM
	- Rebase onto v6.9-rc5

The Permission Overlay Extension makes it possible to constrain permissions on
memory regions. This can be done from userspace (EL0) without a system call or
TLB invalidation.

POE is used to implement the Memory Protection Keys [3] Linux syscall.

The first few patches add the basic framework, then the PKEYS interface is
implemented, and then the selftests are made to work on arm64.

I have tested the modified protection_keys test on x86_64, but not on PPC.
I haven't build-tested the x86/ppc arch changes.

Thanks,
Joey

Joey Gouly (29):
  powerpc/mm: add ARCH_PKEY_BITS to Kconfig
  x86/mm: add ARCH_PKEY_BITS to Kconfig
  mm: use ARCH_PKEY_BITS to define VM_PKEY_BITN
  arm64: disable trapping of POR_EL0 to EL2
  arm64: cpufeature: add Permission Overlay Extension cpucap
  arm64: context switch POR_EL0 register
  KVM: arm64: Save/restore POE registers
  KVM: arm64: make kvm_at() take an OP_AT_*
  KVM: arm64: use `at s1e1a` for POE
  arm64: enable the Permission Overlay Extension for EL0
  arm64: re-order MTE VM_ flags
  arm64: add POIndex defines
  arm64: convert protection key into vm_flags and pgprot values
  arm64: mask out POIndex when modifying a PTE
  arm64: handle PKEY/POE faults
  arm64: add pte_access_permitted_no_overlay()
  arm64: implement PKEYS support
  arm64: add POE signal support
  arm64: enable PKEY support for CPUs with S1POE
  arm64: enable POE and PIE to coexist
  arm64/ptrace: add support for FEAT_POE
  arm64: add Permission Overlay Extension Kconfig
  kselftest/arm64: move get_header()
  selftests: mm: move fpregs printing
  selftests: mm: make protection_keys test work on arm64
  kselftest/arm64: add HWCAP test for FEAT_S1POE
  kselftest/arm64: parse POE_MAGIC in a signal frame
  kselftest/arm64: Add test case for POR_EL0 signal frame records
  KVM: selftests: get-reg-list: add Permission Overlay registers

 Documentation/arch/arm64/elf_hwcaps.rst       |   2 +
 arch/arm64/Kconfig                            |  22 +++
 arch/arm64/include/asm/cpufeature.h           |   6 +
 arch/arm64/include/asm/el2_setup.h            |  10 +-
 arch/arm64/include/asm/hwcap.h                |   1 +
 arch/arm64/include/asm/kvm_asm.h              |   3 +-
 arch/arm64/include/asm/kvm_host.h             |   4 +
 arch/arm64/include/asm/mman.h                 |   8 +-
 arch/arm64/include/asm/mmu.h                  |   1 +
 arch/arm64/include/asm/mmu_context.h          |  51 ++++++-
 arch/arm64/include/asm/pgtable-hwdef.h        |  10 ++
 arch/arm64/include/asm/pgtable-prot.h         |   8 +-
 arch/arm64/include/asm/pgtable.h              |  34 ++++-
 arch/arm64/include/asm/pkeys.h                | 110 ++++++++++++++
 arch/arm64/include/asm/por.h                  |  33 +++++
 arch/arm64/include/asm/processor.h            |   1 +
 arch/arm64/include/asm/sysreg.h               |   3 +
 arch/arm64/include/asm/traps.h                |   1 +
 arch/arm64/include/asm/vncr_mapping.h         |   1 +
 arch/arm64/include/uapi/asm/hwcap.h           |   1 +
 arch/arm64/include/uapi/asm/sigcontext.h      |   7 +
 arch/arm64/kernel/cpufeature.c                |  23 +++
 arch/arm64/kernel/cpuinfo.c                   |   1 +
 arch/arm64/kernel/process.c                   |  28 ++++
 arch/arm64/kernel/ptrace.c                    |  46 ++++++
 arch/arm64/kernel/signal.c                    |  52 +++++++
 arch/arm64/kernel/traps.c                     |  12 +-
 arch/arm64/kvm/hyp/include/hyp/fault.h        |   5 +-
 arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h    |  29 ++++
 arch/arm64/kvm/sys_regs.c                     |   8 +-
 arch/arm64/mm/fault.c                         |  56 ++++++-
 arch/arm64/mm/mmap.c                          |   9 ++
 arch/arm64/mm/mmu.c                           |  40 +++++
 arch/arm64/tools/cpucaps                      |   1 +
 arch/powerpc/Kconfig                          |   4 +
 arch/x86/Kconfig                              |   4 +
 fs/proc/task_mmu.c                            |   2 +
 include/linux/mm.h                            |  20 ++-
 include/uapi/linux/elf.h                      |   1 +
 tools/testing/selftests/arm64/abi/hwcap.c     |  14 ++
 .../testing/selftests/arm64/signal/.gitignore |   1 +
 .../arm64/signal/testcases/poe_siginfo.c      |  86 +++++++++++
 .../arm64/signal/testcases/testcases.c        |  27 +---
 .../arm64/signal/testcases/testcases.h        |  28 +++-
 .../selftests/kvm/aarch64/get-reg-list.c      |  14 ++
 tools/testing/selftests/mm/Makefile           |   2 +-
 tools/testing/selftests/mm/pkey-arm64.h       | 139 ++++++++++++++++++
 tools/testing/selftests/mm/pkey-helpers.h     |   8 +
 tools/testing/selftests/mm/pkey-powerpc.h     |   3 +
 tools/testing/selftests/mm/pkey-x86.h         |   4 +
 tools/testing/selftests/mm/protection_keys.c  | 109 ++++++++++++--
 51 files changed, 1027 insertions(+), 66 deletions(-)
 create mode 100644 arch/arm64/include/asm/pkeys.h
 create mode 100644 arch/arm64/include/asm/por.h
 create mode 100644 tools/testing/selftests/arm64/signal/testcases/poe_siginfo.c
 create mode 100644 tools/testing/selftests/mm/pkey-arm64.h

-- 
2.25.1


^ permalink raw reply	[flat|nested] 146+ messages in thread

* [PATCH v4 01/29] powerpc/mm: add ARCH_PKEY_BITS to Kconfig
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-05-06  8:57   ` Michael Ellerman
  2024-05-03 13:01 ` [PATCH v4 02/29] x86/mm: " Joey Gouly
                   ` (29 subsequent siblings)
  30 siblings, 1 reply; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

The new config option specifies how many bits are in each PKEY.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/Kconfig | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 1c4be3373686..6e33e4726856 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -1020,6 +1020,10 @@ config PPC_MEM_KEYS
 
 	  If unsure, say y.
 
+config ARCH_PKEY_BITS
+	int
+	default 5
+
 config PPC_SECURE_BOOT
 	prompt "Enable secure boot support"
 	bool
-- 
2.25.1



* [PATCH v4 02/29] x86/mm: add ARCH_PKEY_BITS to Kconfig
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
  2024-05-03 13:01 ` [PATCH v4 01/29] powerpc/mm: add ARCH_PKEY_BITS to Kconfig Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-05-03 16:40   ` Dave Hansen
  2024-05-03 13:01 ` [PATCH v4 03/29] mm: use ARCH_PKEY_BITS to define VM_PKEY_BITN Joey Gouly
                   ` (28 subsequent siblings)
  30 siblings, 1 reply; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

The new config option specifies how many bits are in each PKEY.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
---
 arch/x86/Kconfig | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 928820e61cb5..109e767d36e7 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1879,6 +1879,10 @@ config X86_INTEL_MEMORY_PROTECTION_KEYS
 
 	  If unsure, say y.
 
+config ARCH_PKEY_BITS
+	int
+	default 4
+
 choice
 	prompt "TSX enable mode"
 	depends on CPU_SUP_INTEL
-- 
2.25.1



* [PATCH v4 03/29] mm: use ARCH_PKEY_BITS to define VM_PKEY_BITN
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
  2024-05-03 13:01 ` [PATCH v4 01/29] powerpc/mm: add ARCH_PKEY_BITS to Kconfig Joey Gouly
  2024-05-03 13:01 ` [PATCH v4 02/29] x86/mm: " Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-05-03 16:41   ` Dave Hansen
  2024-07-15  7:53   ` Anshuman Khandual
  2024-05-03 13:01 ` [PATCH v4 04/29] arm64: disable trapping of POR_EL0 to EL2 Joey Gouly
                   ` (27 subsequent siblings)
  30 siblings, 2 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

Use the new CONFIG_ARCH_PKEY_BITS to simplify setting these bits
for different architectures.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-mm@kvack.org
---
 fs/proc/task_mmu.c |  2 ++
 include/linux/mm.h | 16 ++++++++++------
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 23fbab954c20..0d152f460dcc 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -692,7 +692,9 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
 		[ilog2(VM_PKEY_BIT0)]	= "",
 		[ilog2(VM_PKEY_BIT1)]	= "",
 		[ilog2(VM_PKEY_BIT2)]	= "",
+#if VM_PKEY_BIT3
 		[ilog2(VM_PKEY_BIT3)]	= "",
+#endif
 #if VM_PKEY_BIT4
 		[ilog2(VM_PKEY_BIT4)]	= "",
 #endif
diff --git a/include/linux/mm.h b/include/linux/mm.h
index b6bdaa18b9e9..5605b938acce 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -329,12 +329,16 @@ extern unsigned int kobjsize(const void *objp);
 #endif /* CONFIG_ARCH_USES_HIGH_VMA_FLAGS */
 
 #ifdef CONFIG_ARCH_HAS_PKEYS
-# define VM_PKEY_SHIFT	VM_HIGH_ARCH_BIT_0
-# define VM_PKEY_BIT0	VM_HIGH_ARCH_0	/* A protection key is a 4-bit value */
-# define VM_PKEY_BIT1	VM_HIGH_ARCH_1	/* on x86 and 5-bit value on ppc64   */
-# define VM_PKEY_BIT2	VM_HIGH_ARCH_2
-# define VM_PKEY_BIT3	VM_HIGH_ARCH_3
-#ifdef CONFIG_PPC
+# define VM_PKEY_SHIFT VM_HIGH_ARCH_BIT_0
+# define VM_PKEY_BIT0  VM_HIGH_ARCH_0
+# define VM_PKEY_BIT1  VM_HIGH_ARCH_1
+# define VM_PKEY_BIT2  VM_HIGH_ARCH_2
+#if CONFIG_ARCH_PKEY_BITS > 3
+# define VM_PKEY_BIT3  VM_HIGH_ARCH_3
+#else
+# define VM_PKEY_BIT3  0
+#endif
+#if CONFIG_ARCH_PKEY_BITS > 4
 # define VM_PKEY_BIT4  VM_HIGH_ARCH_4
 #else
 # define VM_PKEY_BIT4  0
-- 
2.25.1



* [PATCH v4 04/29] arm64: disable trapping of POR_EL0 to EL2
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (2 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 03/29] mm: use ARCH_PKEY_BITS to define VM_PKEY_BITN Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-07-15  7:47   ` Anshuman Khandual
  2024-07-25 15:44   ` Dave Martin
  2024-05-03 13:01 ` [PATCH v4 05/29] arm64: cpufeature: add Permission Overlay Extension cpucap Joey Gouly
                   ` (26 subsequent siblings)
  30 siblings, 2 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

Allow EL0 or EL1 to access POR_EL0 without being trapped to EL2.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/el2_setup.h | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
index b7afaa026842..df5614be4b70 100644
--- a/arch/arm64/include/asm/el2_setup.h
+++ b/arch/arm64/include/asm/el2_setup.h
@@ -184,12 +184,20 @@
 .Lset_pie_fgt_\@:
 	mrs_s	x1, SYS_ID_AA64MMFR3_EL1
 	ubfx	x1, x1, #ID_AA64MMFR3_EL1_S1PIE_SHIFT, #4
-	cbz	x1, .Lset_fgt_\@
+	cbz	x1, .Lset_poe_fgt_\@
 
 	/* Disable trapping of PIR_EL1 / PIRE0_EL1 */
 	orr	x0, x0, #HFGxTR_EL2_nPIR_EL1
 	orr	x0, x0, #HFGxTR_EL2_nPIRE0_EL1
 
+.Lset_poe_fgt_\@:
+	mrs_s	x1, SYS_ID_AA64MMFR3_EL1
+	ubfx	x1, x1, #ID_AA64MMFR3_EL1_S1POE_SHIFT, #4
+	cbz	x1, .Lset_fgt_\@
+
+	/* Disable trapping of POR_EL0 */
+	orr	x0, x0, #HFGxTR_EL2_nPOR_EL0
+
 .Lset_fgt_\@:
 	msr_s	SYS_HFGRTR_EL2, x0
 	msr_s	SYS_HFGWTR_EL2, x0
-- 
2.25.1



* [PATCH v4 05/29] arm64: cpufeature: add Permission Overlay Extension cpucap
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (3 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 04/29] arm64: disable trapping of POR_EL0 to EL2 Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-06-21 16:58   ` Catalin Marinas
                     ` (2 more replies)
  2024-05-03 13:01 ` [PATCH v4 06/29] arm64: context switch POR_EL0 register Joey Gouly
                   ` (25 subsequent siblings)
  30 siblings, 3 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

This cpucap indicates whether the system supports POE. It is a
CPUCAP_BOOT_CPU_FEATURE because the boot CPU will enable POE if it has it,
so secondary CPUs must also have this feature.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/kernel/cpufeature.c | 9 +++++++++
 arch/arm64/tools/cpucaps       | 1 +
 2 files changed, 10 insertions(+)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 56583677c1f2..2f3c2346e156 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -2861,6 +2861,15 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.matches = has_nv1,
 		ARM64_CPUID_FIELDS_NEG(ID_AA64MMFR4_EL1, E2H0, NI_NV1)
 	},
+#ifdef CONFIG_ARM64_POE
+	{
+		.desc = "Stage-1 Permission Overlay Extension (S1POE)",
+		.capability = ARM64_HAS_S1POE,
+		.type = ARM64_CPUCAP_BOOT_CPU_FEATURE,
+		.matches = has_cpuid_feature,
+		ARM64_CPUID_FIELDS(ID_AA64MMFR3_EL1, S1POE, IMP)
+	},
+#endif
 	{},
 };
 
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index 62b2838a231a..45f558fc0d87 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -45,6 +45,7 @@ HAS_MOPS
 HAS_NESTED_VIRT
 HAS_PAN
 HAS_S1PIE
+HAS_S1POE
 HAS_RAS_EXTN
 HAS_RNG
 HAS_SB
-- 
2.25.1



* [PATCH v4 06/29] arm64: context switch POR_EL0 register
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (4 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 05/29] arm64: cpufeature: add Permission Overlay Extension cpucap Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-06-21 17:03   ` Catalin Marinas
                     ` (4 more replies)
  2024-05-03 13:01 ` [PATCH v4 07/29] KVM: arm64: Save/restore POE registers Joey Gouly
                   ` (24 subsequent siblings)
  30 siblings, 5 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

POR_EL0 is a register that can be modified by userspace directly,
so it must be context switched.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/cpufeature.h |  6 ++++++
 arch/arm64/include/asm/processor.h  |  1 +
 arch/arm64/include/asm/sysreg.h     |  3 +++
 arch/arm64/kernel/process.c         | 28 ++++++++++++++++++++++++++++
 4 files changed, 38 insertions(+)

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 8b904a757bd3..d46aab23e06e 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -832,6 +832,12 @@ static inline bool system_supports_lpa2(void)
 	return cpus_have_final_cap(ARM64_HAS_LPA2);
 }
 
+static inline bool system_supports_poe(void)
+{
+	return IS_ENABLED(CONFIG_ARM64_POE) &&
+		alternative_has_cap_unlikely(ARM64_HAS_S1POE);
+}
+
 int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
 bool try_emulate_mrs(struct pt_regs *regs, u32 isn);
 
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index f77371232d8c..e6376f979273 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -184,6 +184,7 @@ struct thread_struct {
 	u64			sctlr_user;
 	u64			svcr;
 	u64			tpidr2_el0;
+	u64			por_el0;
 };
 
 static inline unsigned int thread_get_vl(struct thread_struct *thread,
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 9e8999592f3a..62c399811dbf 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -1064,6 +1064,9 @@
 #define POE_RXW		UL(0x7)
 #define POE_MASK	UL(0xf)
 
+/* Initial value for Permission Overlay Extension for EL0 */
+#define POR_EL0_INIT	POE_RXW
+
 #define ARM64_FEATURE_FIELD_BITS	4
 
 /* Defined for compatibility only, do not add new users. */
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 4ae31b7af6c3..0ffaca98bed6 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -271,12 +271,23 @@ static void flush_tagged_addr_state(void)
 		clear_thread_flag(TIF_TAGGED_ADDR);
 }
 
+static void flush_poe(void)
+{
+	if (!system_supports_poe())
+		return;
+
+	write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0);
+	/* ISB required for kernel uaccess routines when changing POR_EL0 */
+	isb();
+}
+
 void flush_thread(void)
 {
 	fpsimd_flush_thread();
 	tls_thread_flush();
 	flush_ptrace_hw_breakpoint(current);
 	flush_tagged_addr_state();
+	flush_poe();
 }
 
 void arch_release_task_struct(struct task_struct *tsk)
@@ -371,6 +382,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
 		if (system_supports_tpidr2())
 			p->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
 
+		if (system_supports_poe())
+			p->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
+
 		if (stack_start) {
 			if (is_compat_thread(task_thread_info(p)))
 				childregs->compat_sp = stack_start;
@@ -495,6 +509,19 @@ static void erratum_1418040_new_exec(void)
 	preempt_enable();
 }
 
+static void permission_overlay_switch(struct task_struct *next)
+{
+	if (!system_supports_poe())
+		return;
+
+	current->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
+	if (current->thread.por_el0 != next->thread.por_el0) {
+		write_sysreg_s(next->thread.por_el0, SYS_POR_EL0);
+		/* ISB required for kernel uaccess routines when changing POR_EL0 */
+		isb();
+	}
+}
+
 /*
  * __switch_to() checks current->thread.sctlr_user as an optimisation. Therefore
  * this function must be called with preemption disabled and the update to
@@ -530,6 +557,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
 	ssbs_thread_switch(next);
 	erratum_1418040_thread_switch(next);
 	ptrauth_thread_switch_user(next);
+	permission_overlay_switch(next);
 
 	/*
 	 * Complete any pending TLB or cache maintenance on this CPU in case
-- 
2.25.1



* [PATCH v4 07/29] KVM: arm64: Save/restore POE registers
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (5 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 06/29] arm64: context switch POR_EL0 register Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-05-29 15:43   ` Marc Zyngier
  2024-08-16 14:55   ` Marc Zyngier
  2024-05-03 13:01 ` [PATCH v4 08/29] KVM: arm64: make kvm_at() take an OP_AT_* Joey Gouly
                   ` (23 subsequent siblings)
  30 siblings, 2 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

Define the new system registers that POE introduces and context switch them.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/kvm_host.h          |  4 +++
 arch/arm64/include/asm/vncr_mapping.h      |  1 +
 arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 29 ++++++++++++++++++++++
 arch/arm64/kvm/sys_regs.c                  |  8 ++++--
 4 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 9e8a496fb284..28042da0befd 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -419,6 +419,8 @@ enum vcpu_sysreg {
 	GCR_EL1,	/* Tag Control Register */
 	TFSRE0_EL1,	/* Tag Fault Status Register (EL0) */
 
+	POR_EL0,	/* Permission Overlay Register 0 (EL0) */
+
 	/* 32bit specific registers. */
 	DACR32_EL2,	/* Domain Access Control Register */
 	IFSR32_EL2,	/* Instruction Fault Status Register */
@@ -489,6 +491,8 @@ enum vcpu_sysreg {
 	VNCR(PIR_EL1),	 /* Permission Indirection Register 1 (EL1) */
 	VNCR(PIRE0_EL1), /*  Permission Indirection Register 0 (EL1) */
 
+	VNCR(POR_EL1),	/* Permission Overlay Register 1 (EL1) */
+
 	VNCR(HFGRTR_EL2),
 	VNCR(HFGWTR_EL2),
 	VNCR(HFGITR_EL2),
diff --git a/arch/arm64/include/asm/vncr_mapping.h b/arch/arm64/include/asm/vncr_mapping.h
index df2c47c55972..06f8ec0906a6 100644
--- a/arch/arm64/include/asm/vncr_mapping.h
+++ b/arch/arm64/include/asm/vncr_mapping.h
@@ -52,6 +52,7 @@
 #define VNCR_PIRE0_EL1		0x290
 #define VNCR_PIRE0_EL2		0x298
 #define VNCR_PIR_EL1		0x2A0
+#define VNCR_POR_EL1		0x2A8
 #define VNCR_ICH_LR0_EL2        0x400
 #define VNCR_ICH_LR1_EL2        0x408
 #define VNCR_ICH_LR2_EL2        0x410
diff --git a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
index 4be6a7fa0070..1c9536557bae 100644
--- a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
+++ b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
@@ -16,9 +16,15 @@
 #include <asm/kvm_hyp.h>
 #include <asm/kvm_mmu.h>
 
+static inline bool ctxt_has_s1poe(struct kvm_cpu_context *ctxt);
+
 static inline void __sysreg_save_common_state(struct kvm_cpu_context *ctxt)
 {
 	ctxt_sys_reg(ctxt, MDSCR_EL1)	= read_sysreg(mdscr_el1);
+
+	// POR_EL0 can affect uaccess, so must be saved/restored early.
+	if (ctxt_has_s1poe(ctxt))
+		ctxt_sys_reg(ctxt, POR_EL0)	= read_sysreg_s(SYS_POR_EL0);
 }
 
 static inline void __sysreg_save_user_state(struct kvm_cpu_context *ctxt)
@@ -55,6 +61,17 @@ static inline bool ctxt_has_s1pie(struct kvm_cpu_context *ctxt)
 	return kvm_has_feat(kern_hyp_va(vcpu->kvm), ID_AA64MMFR3_EL1, S1PIE, IMP);
 }
 
+static inline bool ctxt_has_s1poe(struct kvm_cpu_context *ctxt)
+{
+	struct kvm_vcpu *vcpu;
+
+	if (!system_supports_poe())
+		return false;
+
+	vcpu = ctxt_to_vcpu(ctxt);
+	return kvm_has_feat(kern_hyp_va(vcpu->kvm), ID_AA64MMFR3_EL1, S1POE, IMP);
+}
+
 static inline void __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
 {
 	ctxt_sys_reg(ctxt, SCTLR_EL1)	= read_sysreg_el1(SYS_SCTLR);
@@ -77,6 +94,10 @@ static inline void __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
 		ctxt_sys_reg(ctxt, PIR_EL1)	= read_sysreg_el1(SYS_PIR);
 		ctxt_sys_reg(ctxt, PIRE0_EL1)	= read_sysreg_el1(SYS_PIRE0);
 	}
+
+	if (ctxt_has_s1poe(ctxt))
+		ctxt_sys_reg(ctxt, POR_EL1)	= read_sysreg_el1(SYS_POR);
+
 	ctxt_sys_reg(ctxt, PAR_EL1)	= read_sysreg_par();
 	ctxt_sys_reg(ctxt, TPIDR_EL1)	= read_sysreg(tpidr_el1);
 
@@ -107,6 +128,10 @@ static inline void __sysreg_save_el2_return_state(struct kvm_cpu_context *ctxt)
 static inline void __sysreg_restore_common_state(struct kvm_cpu_context *ctxt)
 {
 	write_sysreg(ctxt_sys_reg(ctxt, MDSCR_EL1),  mdscr_el1);
+
+	// POR_EL0 can affect uaccess, so must be saved/restored early.
+	if (ctxt_has_s1poe(ctxt))
+		write_sysreg_s(ctxt_sys_reg(ctxt, POR_EL0),	SYS_POR_EL0);
 }
 
 static inline void __sysreg_restore_user_state(struct kvm_cpu_context *ctxt)
@@ -153,6 +178,10 @@ static inline void __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt)
 		write_sysreg_el1(ctxt_sys_reg(ctxt, PIR_EL1),	SYS_PIR);
 		write_sysreg_el1(ctxt_sys_reg(ctxt, PIRE0_EL1),	SYS_PIRE0);
 	}
+
+	if (ctxt_has_s1poe(ctxt))
+		write_sysreg_el1(ctxt_sys_reg(ctxt, POR_EL1),	SYS_POR);
+
 	write_sysreg(ctxt_sys_reg(ctxt, PAR_EL1),	par_el1);
 	write_sysreg(ctxt_sys_reg(ctxt, TPIDR_EL1),	tpidr_el1);
 
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index c9f4f387155f..be04fae35afb 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -2423,6 +2423,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	{ SYS_DESC(SYS_MAIR_EL1), access_vm_reg, reset_unknown, MAIR_EL1 },
 	{ SYS_DESC(SYS_PIRE0_EL1), NULL, reset_unknown, PIRE0_EL1 },
 	{ SYS_DESC(SYS_PIR_EL1), NULL, reset_unknown, PIR_EL1 },
+	{ SYS_DESC(SYS_POR_EL1), NULL, reset_unknown, POR_EL1 },
 	{ SYS_DESC(SYS_AMAIR_EL1), access_vm_reg, reset_amair_el1, AMAIR_EL1 },
 
 	{ SYS_DESC(SYS_LORSA_EL1), trap_loregion },
@@ -2506,6 +2507,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	  .access = access_pmovs, .reg = PMOVSSET_EL0,
 	  .get_user = get_pmreg, .set_user = set_pmreg },
 
+	{ SYS_DESC(SYS_POR_EL0), NULL, reset_unknown, POR_EL0 },
 	{ SYS_DESC(SYS_TPIDR_EL0), NULL, reset_unknown, TPIDR_EL0 },
 	{ SYS_DESC(SYS_TPIDRRO_EL0), NULL, reset_unknown, TPIDRRO_EL0 },
 	{ SYS_DESC(SYS_TPIDR2_EL0), undef_access },
@@ -4057,8 +4059,6 @@ void kvm_init_sysreg(struct kvm_vcpu *vcpu)
 	kvm->arch.fgu[HFGxTR_GROUP] = (HFGxTR_EL2_nAMAIR2_EL1		|
 				       HFGxTR_EL2_nMAIR2_EL1		|
 				       HFGxTR_EL2_nS2POR_EL1		|
-				       HFGxTR_EL2_nPOR_EL1		|
-				       HFGxTR_EL2_nPOR_EL0		|
 				       HFGxTR_EL2_nACCDATA_EL1		|
 				       HFGxTR_EL2_nSMPRI_EL1_MASK	|
 				       HFGxTR_EL2_nTPIDR2_EL0_MASK);
@@ -4093,6 +4093,10 @@ void kvm_init_sysreg(struct kvm_vcpu *vcpu)
 		kvm->arch.fgu[HFGxTR_GROUP] |= (HFGxTR_EL2_nPIRE0_EL1 |
 						HFGxTR_EL2_nPIR_EL1);
 
+	if (!kvm_has_feat(kvm, ID_AA64MMFR3_EL1, S1POE, IMP))
+		kvm->arch.fgu[HFGxTR_GROUP] |= (HFGxTR_EL2_nPOR_EL1 |
+						HFGxTR_EL2_nPOR_EL0);
+
 	if (!kvm_has_feat(kvm, ID_AA64PFR0_EL1, AMU, IMP))
 		kvm->arch.fgu[HAFGRTR_GROUP] |= ~(HAFGRTR_EL2_RES0 |
 						  HAFGRTR_EL2_RES1);
-- 
2.25.1



* [PATCH v4 08/29] KVM: arm64: make kvm_at() take an OP_AT_*
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (6 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 07/29] KVM: arm64: Save/restore POE registers Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-05-29 15:46   ` Marc Zyngier
  2024-07-15  8:36   ` Anshuman Khandual
  2024-05-03 13:01 ` [PATCH v4 09/29] KVM: arm64: use `at s1e1a` for POE Joey Gouly
                   ` (22 subsequent siblings)
  30 siblings, 2 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

To allow using newer instructions that current assemblers don't know about,
replace the `at` instruction with the underlying SYS instruction.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/kvm_asm.h       | 3 ++-
 arch/arm64/kvm/hyp/include/hyp/fault.h | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 24b5e6b23417..ce65fd0f01b0 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -10,6 +10,7 @@
 #include <asm/hyp_image.h>
 #include <asm/insn.h>
 #include <asm/virt.h>
+#include <asm/sysreg.h>
 
 #define ARM_EXIT_WITH_SERROR_BIT  31
 #define ARM_EXCEPTION_CODE(x)	  ((x) & ~(1U << ARM_EXIT_WITH_SERROR_BIT))
@@ -261,7 +262,7 @@ extern u64 __kvm_get_mdcr_el2(void);
 	asm volatile(							\
 	"	mrs	%1, spsr_el2\n"					\
 	"	mrs	%2, elr_el2\n"					\
-	"1:	at	"at_op", %3\n"					\
+	"1:	" __msr_s(at_op, "%3") "\n"				\
 	"	isb\n"							\
 	"	b	9f\n"						\
 	"2:	msr	spsr_el2, %1\n"					\
diff --git a/arch/arm64/kvm/hyp/include/hyp/fault.h b/arch/arm64/kvm/hyp/include/hyp/fault.h
index 9e13c1bc2ad5..487c06099d6f 100644
--- a/arch/arm64/kvm/hyp/include/hyp/fault.h
+++ b/arch/arm64/kvm/hyp/include/hyp/fault.h
@@ -27,7 +27,7 @@ static inline bool __translate_far_to_hpfar(u64 far, u64 *hpfar)
 	 * saved the guest context yet, and we may return early...
 	 */
 	par = read_sysreg_par();
-	if (!__kvm_at("s1e1r", far))
+	if (!__kvm_at(OP_AT_S1E1R, far))
 		tmp = read_sysreg_par();
 	else
 		tmp = SYS_PAR_EL1_F; /* back to the guest */
-- 
2.25.1



* [PATCH v4 09/29] KVM: arm64: use `at s1e1a` for POE
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (7 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 08/29] KVM: arm64: make kvm_at() take an OP_AT_* Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-05-29 15:50   ` Marc Zyngier
  2024-07-15  8:45   ` Anshuman Khandual
  2024-05-03 13:01 ` [PATCH v4 10/29] arm64: enable the Permission Overlay Extension for EL0 Joey Gouly
                   ` (21 subsequent siblings)
  30 siblings, 2 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

FEAT_ATS1E1A introduces a new instruction: `at s1e1a`.
This is an address translation, without permission checks.

POE allows the guest to remove read permissions at stage 1.  This means
that an `at` instruction could fail and not retrieve the IPA.

Switch to using `at s1e1a` so that KVM can get the IPA regardless of S1
permissions.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/kvm/hyp/include/hyp/fault.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp/include/hyp/fault.h b/arch/arm64/kvm/hyp/include/hyp/fault.h
index 487c06099d6f..17df94570f03 100644
--- a/arch/arm64/kvm/hyp/include/hyp/fault.h
+++ b/arch/arm64/kvm/hyp/include/hyp/fault.h
@@ -14,6 +14,7 @@
 
 static inline bool __translate_far_to_hpfar(u64 far, u64 *hpfar)
 {
+	int ret;
 	u64 par, tmp;
 
 	/*
@@ -27,7 +28,9 @@ static inline bool __translate_far_to_hpfar(u64 far, u64 *hpfar)
 	 * saved the guest context yet, and we may return early...
 	 */
 	par = read_sysreg_par();
-	if (!__kvm_at(OP_AT_S1E1R, far))
+	ret = system_supports_poe() ? __kvm_at(OP_AT_S1E1A, far) :
+	                              __kvm_at(OP_AT_S1E1R, far);
+	if (!ret)
 		tmp = read_sysreg_par();
 	else
 		tmp = SYS_PAR_EL1_F; /* back to the guest */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v4 10/29] arm64: enable the Permission Overlay Extension for EL0
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (8 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 09/29] KVM: arm64: use `at s1e1a` for POE Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-06-21 17:04   ` Catalin Marinas
                     ` (3 more replies)
  2024-05-03 13:01 ` [PATCH v4 11/29] arm64: re-order MTE VM_ flags Joey Gouly
                   ` (20 subsequent siblings)
  30 siblings, 4 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

Expose a HWCAP and the ID_AA64MMFR3_EL1.S1POE field to userspace, so that
either can be used to check whether the CPU supports the feature.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---

This takes the last bit of HWCAP2; is this fine? How should further features be handled in the future?


 Documentation/arch/arm64/elf_hwcaps.rst |  2 ++
 arch/arm64/include/asm/hwcap.h          |  1 +
 arch/arm64/include/uapi/asm/hwcap.h     |  1 +
 arch/arm64/kernel/cpufeature.c          | 14 ++++++++++++++
 arch/arm64/kernel/cpuinfo.c             |  1 +
 5 files changed, 19 insertions(+)

diff --git a/Documentation/arch/arm64/elf_hwcaps.rst b/Documentation/arch/arm64/elf_hwcaps.rst
index 448c1664879b..694f67fa07d1 100644
--- a/Documentation/arch/arm64/elf_hwcaps.rst
+++ b/Documentation/arch/arm64/elf_hwcaps.rst
@@ -365,6 +365,8 @@ HWCAP2_SME_SF8DP2
 HWCAP2_SME_SF8DP4
     Functionality implied by ID_AA64SMFR0_EL1.SF8DP4 == 0b1.
 
+HWCAP2_POE
+    Functionality implied by ID_AA64MMFR3_EL1.S1POE == 0b0001.
 
 4. Unused AT_HWCAP bits
 -----------------------
diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
index 4edd3b61df11..a775adddecf2 100644
--- a/arch/arm64/include/asm/hwcap.h
+++ b/arch/arm64/include/asm/hwcap.h
@@ -157,6 +157,7 @@
 #define KERNEL_HWCAP_SME_SF8FMA		__khwcap2_feature(SME_SF8FMA)
 #define KERNEL_HWCAP_SME_SF8DP4		__khwcap2_feature(SME_SF8DP4)
 #define KERNEL_HWCAP_SME_SF8DP2		__khwcap2_feature(SME_SF8DP2)
+#define KERNEL_HWCAP_POE		__khwcap2_feature(POE)
 
 /*
  * This yields a mask that user programs can use to figure out what
diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h
index 285610e626f5..055381b2c615 100644
--- a/arch/arm64/include/uapi/asm/hwcap.h
+++ b/arch/arm64/include/uapi/asm/hwcap.h
@@ -122,5 +122,6 @@
 #define HWCAP2_SME_SF8FMA	(1UL << 60)
 #define HWCAP2_SME_SF8DP4	(1UL << 61)
 #define HWCAP2_SME_SF8DP2	(1UL << 62)
+#define HWCAP2_POE		(1UL << 63)
 
 #endif /* _UAPI__ASM_HWCAP_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 2f3c2346e156..8c02aae9db11 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -465,6 +465,8 @@ static const struct arm64_ftr_bits ftr_id_aa64mmfr2[] = {
 };
 
 static const struct arm64_ftr_bits ftr_id_aa64mmfr3[] = {
+	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_POE),
+		       FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR3_EL1_S1POE_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR3_EL1_S1PIE_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR3_EL1_TCRX_SHIFT, 4, 0),
 	ARM64_FTR_END,
@@ -2339,6 +2341,14 @@ static void cpu_enable_mops(const struct arm64_cpu_capabilities *__unused)
 	sysreg_clear_set(sctlr_el1, 0, SCTLR_EL1_MSCEn);
 }
 
+#ifdef CONFIG_ARM64_POE
+static void cpu_enable_poe(const struct arm64_cpu_capabilities *__unused)
+{
+	sysreg_clear_set(REG_TCR2_EL1, 0, TCR2_EL1x_E0POE);
+	sysreg_clear_set(CPACR_EL1, 0, CPACR_ELx_E0POE);
+}
+#endif
+
 /* Internal helper functions to match cpu capability type */
 static bool
 cpucap_late_cpu_optional(const struct arm64_cpu_capabilities *cap)
@@ -2867,6 +2877,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.capability = ARM64_HAS_S1POE,
 		.type = ARM64_CPUCAP_BOOT_CPU_FEATURE,
 		.matches = has_cpuid_feature,
+		.cpu_enable = cpu_enable_poe,
 		ARM64_CPUID_FIELDS(ID_AA64MMFR3_EL1, S1POE, IMP)
 	},
 #endif
@@ -3034,6 +3045,9 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
 	HWCAP_CAP(ID_AA64FPFR0_EL1, F8DP2, IMP, CAP_HWCAP, KERNEL_HWCAP_F8DP2),
 	HWCAP_CAP(ID_AA64FPFR0_EL1, F8E4M3, IMP, CAP_HWCAP, KERNEL_HWCAP_F8E4M3),
 	HWCAP_CAP(ID_AA64FPFR0_EL1, F8E5M2, IMP, CAP_HWCAP, KERNEL_HWCAP_F8E5M2),
+#ifdef CONFIG_ARM64_POE
+	HWCAP_CAP(ID_AA64MMFR3_EL1, S1POE, IMP, CAP_HWCAP, KERNEL_HWCAP_POE),
+#endif
 	{},
 };
 
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 09eeaa24d456..b9db812082b3 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -143,6 +143,7 @@ static const char *const hwcap_str[] = {
 	[KERNEL_HWCAP_SME_SF8FMA]	= "smesf8fma",
 	[KERNEL_HWCAP_SME_SF8DP4]	= "smesf8dp4",
 	[KERNEL_HWCAP_SME_SF8DP2]	= "smesf8dp2",
+	[KERNEL_HWCAP_POE]		= "poe",
 };
 
 #ifdef CONFIG_COMPAT
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v4 11/29] arm64: re-order MTE VM_ flags
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (9 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 10/29] arm64: enable the Permission Overlay Extension for EL0 Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-06-21 17:04   ` Catalin Marinas
  2024-07-15  9:21   ` Anshuman Khandual
  2024-05-03 13:01 ` [PATCH v4 12/29] arm64: add POIndex defines Joey Gouly
                   ` (19 subsequent siblings)
  30 siblings, 2 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

To make it easier to share the generic PKEY vm_flags, move the MTE flags
from VM_HIGH_ARCH_0/1 to VM_HIGH_ARCH_4/5.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 include/linux/mm.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5605b938acce..2065727b3787 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -377,8 +377,8 @@ extern unsigned int kobjsize(const void *objp);
 #endif
 
 #if defined(CONFIG_ARM64_MTE)
-# define VM_MTE		VM_HIGH_ARCH_0	/* Use Tagged memory for access control */
-# define VM_MTE_ALLOWED	VM_HIGH_ARCH_1	/* Tagged memory permitted */
+# define VM_MTE		VM_HIGH_ARCH_4	/* Use Tagged memory for access control */
+# define VM_MTE_ALLOWED	VM_HIGH_ARCH_5	/* Tagged memory permitted */
 #else
 # define VM_MTE		VM_NONE
 # define VM_MTE_ALLOWED	VM_NONE
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v4 12/29] arm64: add POIndex defines
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (10 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 11/29] arm64: re-order MTE VM_ flags Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-06-21 17:05   ` Catalin Marinas
  2024-07-15  9:26   ` Anshuman Khandual
  2024-05-03 13:01 ` [PATCH v4 13/29] arm64: convert protection key into vm_flags and pgprot values Joey Gouly
                   ` (18 subsequent siblings)
  30 siblings, 2 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

The 3-bit POIndex is stored in the PTE at bits 60..62.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/pgtable-hwdef.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index ef207a0d4f0d..370a02922fe1 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -198,6 +198,16 @@
 #define PTE_PI_IDX_2	53	/* PXN */
 #define PTE_PI_IDX_3	54	/* UXN */
 
+/*
+ * POIndex[2:0] encoding (Permission Overlay Extension)
+ */
+#define PTE_PO_IDX_0	(_AT(pteval_t, 1) << 60)
+#define PTE_PO_IDX_1	(_AT(pteval_t, 1) << 61)
+#define PTE_PO_IDX_2	(_AT(pteval_t, 1) << 62)
+
+#define PTE_PO_IDX_MASK		GENMASK_ULL(62, 60)
+
+
 /*
  * Memory Attribute override for Stage-2 (MemAttr[3:0])
  */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v4 13/29] arm64: convert protection key into vm_flags and pgprot values
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (11 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 12/29] arm64: add POIndex defines Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-05-28  6:54   ` Amit Daniel Kachhap
                     ` (2 more replies)
  2024-05-03 13:01 ` [PATCH v4 14/29] arm64: mask out POIndex when modifying a PTE Joey Gouly
                   ` (17 subsequent siblings)
  30 siblings, 3 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

Modify arch_calc_vm_prot_bits() and vm_get_page_prot() such that the pkey
value is set in the vm_flags and then into the pgprot value.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/mman.h | 8 +++++++-
 arch/arm64/mm/mmap.c          | 9 +++++++++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
index 5966ee4a6154..ecb2d18dc4d7 100644
--- a/arch/arm64/include/asm/mman.h
+++ b/arch/arm64/include/asm/mman.h
@@ -7,7 +7,7 @@
 #include <uapi/asm/mman.h>
 
 static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
-	unsigned long pkey __always_unused)
+	unsigned long pkey)
 {
 	unsigned long ret = 0;
 
@@ -17,6 +17,12 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
 	if (system_supports_mte() && (prot & PROT_MTE))
 		ret |= VM_MTE;
 
+#if defined(CONFIG_ARCH_HAS_PKEYS)
+	ret |= pkey & 0x1 ? VM_PKEY_BIT0 : 0;
+	ret |= pkey & 0x2 ? VM_PKEY_BIT1 : 0;
+	ret |= pkey & 0x4 ? VM_PKEY_BIT2 : 0;
+#endif
+
 	return ret;
 }
 #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
diff --git a/arch/arm64/mm/mmap.c b/arch/arm64/mm/mmap.c
index 642bdf908b22..86eda6bc7893 100644
--- a/arch/arm64/mm/mmap.c
+++ b/arch/arm64/mm/mmap.c
@@ -102,6 +102,15 @@ pgprot_t vm_get_page_prot(unsigned long vm_flags)
 	if (vm_flags & VM_MTE)
 		prot |= PTE_ATTRINDX(MT_NORMAL_TAGGED);
 
+#ifdef CONFIG_ARCH_HAS_PKEYS
+	if (vm_flags & VM_PKEY_BIT0)
+		prot |= PTE_PO_IDX_0;
+	if (vm_flags & VM_PKEY_BIT1)
+		prot |= PTE_PO_IDX_1;
+	if (vm_flags & VM_PKEY_BIT2)
+		prot |= PTE_PO_IDX_2;
+#endif
+
 	return __pgprot(prot);
 }
 EXPORT_SYMBOL(vm_get_page_prot);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v4 14/29] arm64: mask out POIndex when modifying a PTE
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (12 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 13/29] arm64: convert protection key into vm_flags and pgprot values Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-07-16  9:10   ` Anshuman Khandual
  2024-05-03 13:01 ` [PATCH v4 15/29] arm64: handle PKEY/POE faults Joey Gouly
                   ` (16 subsequent siblings)
  30 siblings, 1 reply; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

When a PTE is modified with pte_modify(), the POIndex bits must be included
in the mask so that they can be updated from the new protection value.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/pgtable.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index afdd56d26ad7..5c970a9cca67 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -1028,7 +1028,8 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 	 */
 	const pteval_t mask = PTE_USER | PTE_PXN | PTE_UXN | PTE_RDONLY |
 			      PTE_PROT_NONE | PTE_VALID | PTE_WRITE | PTE_GP |
-			      PTE_ATTRINDX_MASK;
+			      PTE_ATTRINDX_MASK | PTE_PO_IDX_MASK;
+
 	/* preserve the hardware dirty information */
 	if (pte_hw_dirty(pte))
 		pte = set_pte_bit(pte, __pgprot(PTE_DIRTY));
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v4 15/29] arm64: handle PKEY/POE faults
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (13 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 14/29] arm64: mask out POIndex when modifying a PTE Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-06-21 16:57   ` Catalin Marinas
                     ` (3 more replies)
  2024-05-03 13:01 ` [PATCH v4 16/29] arm64: add pte_access_permitted_no_overlay() Joey Gouly
                   ` (15 subsequent siblings)
  30 siblings, 4 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

If a memory fault occurs that is due to an overlay/pkey fault, report that to
userspace with a SEGV_PKUERR.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/traps.h |  1 +
 arch/arm64/kernel/traps.c      | 12 ++++++--
 arch/arm64/mm/fault.c          | 56 ++++++++++++++++++++++++++++++++--
 3 files changed, 64 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/traps.h b/arch/arm64/include/asm/traps.h
index eefe766d6161..f6f6f2cb7f10 100644
--- a/arch/arm64/include/asm/traps.h
+++ b/arch/arm64/include/asm/traps.h
@@ -25,6 +25,7 @@ try_emulate_armv8_deprecated(struct pt_regs *regs, u32 insn)
 void force_signal_inject(int signal, int code, unsigned long address, unsigned long err);
 void arm64_notify_segfault(unsigned long addr);
 void arm64_force_sig_fault(int signo, int code, unsigned long far, const char *str);
+void arm64_force_sig_fault_pkey(int signo, int code, unsigned long far, const char *str, int pkey);
 void arm64_force_sig_mceerr(int code, unsigned long far, short lsb, const char *str);
 void arm64_force_sig_ptrace_errno_trap(int errno, unsigned long far, const char *str);
 
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 215e6d7f2df8..1bac6c84d3f5 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -263,16 +263,24 @@ static void arm64_show_signal(int signo, const char *str)
 	__show_regs(regs);
 }
 
-void arm64_force_sig_fault(int signo, int code, unsigned long far,
-			   const char *str)
+void arm64_force_sig_fault_pkey(int signo, int code, unsigned long far,
+			   const char *str, int pkey)
 {
 	arm64_show_signal(signo, str);
 	if (signo == SIGKILL)
 		force_sig(SIGKILL);
+	else if (code == SEGV_PKUERR)
+		force_sig_pkuerr((void __user *)far, pkey);
 	else
 		force_sig_fault(signo, code, (void __user *)far);
 }
 
+void arm64_force_sig_fault(int signo, int code, unsigned long far,
+			   const char *str)
+{
+	arm64_force_sig_fault_pkey(signo, code, far, str, 0);
+}
+
 void arm64_force_sig_mceerr(int code, unsigned long far, short lsb,
 			    const char *str)
 {
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 8251e2fea9c7..585295168918 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -23,6 +23,7 @@
 #include <linux/sched/debug.h>
 #include <linux/highmem.h>
 #include <linux/perf_event.h>
+#include <linux/pkeys.h>
 #include <linux/preempt.h>
 #include <linux/hugetlb.h>
 
@@ -489,6 +490,23 @@ static void do_bad_area(unsigned long far, unsigned long esr,
 #define VM_FAULT_BADMAP		((__force vm_fault_t)0x010000)
 #define VM_FAULT_BADACCESS	((__force vm_fault_t)0x020000)
 
+static bool fault_from_pkey(unsigned long esr, struct vm_area_struct *vma,
+			unsigned int mm_flags)
+{
+	unsigned long iss2 = ESR_ELx_ISS2(esr);
+
+	if (!arch_pkeys_enabled())
+		return false;
+
+	if (iss2 & ESR_ELx_Overlay)
+		return true;
+
+	return !arch_vma_access_permitted(vma,
+			mm_flags & FAULT_FLAG_WRITE,
+			mm_flags & FAULT_FLAG_INSTRUCTION,
+			mm_flags & FAULT_FLAG_REMOTE);
+}
+
 static vm_fault_t __do_page_fault(struct mm_struct *mm,
 				  struct vm_area_struct *vma, unsigned long addr,
 				  unsigned int mm_flags, unsigned long vm_flags,
@@ -529,6 +547,8 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
 	unsigned int mm_flags = FAULT_FLAG_DEFAULT;
 	unsigned long addr = untagged_addr(far);
 	struct vm_area_struct *vma;
+	bool pkey_fault = false;
+	int pkey = -1;
 
 	if (kprobe_page_fault(regs, esr))
 		return 0;
@@ -590,6 +610,12 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
 		vma_end_read(vma);
 		goto lock_mmap;
 	}
+
+	if (fault_from_pkey(esr, vma, mm_flags)) {
+		vma_end_read(vma);
+		goto lock_mmap;
+	}
+
 	fault = handle_mm_fault(vma, addr, mm_flags | FAULT_FLAG_VMA_LOCK, regs);
 	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
 		vma_end_read(vma);
@@ -617,6 +643,11 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
 		goto done;
 	}
 
+	if (fault_from_pkey(esr, vma, mm_flags)) {
+		pkey_fault = true;
+		pkey = vma_pkey(vma);
+	}
+
 	fault = __do_page_fault(mm, vma, addr, mm_flags, vm_flags, regs);
 
 	/* Quick path to respond to signals */
@@ -682,9 +713,28 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
 		 * Something tried to access memory that isn't in our memory
 		 * map.
 		 */
-		arm64_force_sig_fault(SIGSEGV,
-				      fault == VM_FAULT_BADACCESS ? SEGV_ACCERR : SEGV_MAPERR,
-				      far, inf->name);
+		int fault_kind;
+		/*
+		 * The pkey value that we return to userspace can be different
+		 * from the pkey that caused the fault.
+		 *
+		 * 1. T1   : mprotect_key(foo, PAGE_SIZE, pkey=4);
+		 * 2. T1   : set POR_EL0 to deny access to pkey=4, touches, page
+		 * 3. T1   : faults...
+		 * 4.    T2: mprotect_key(foo, PAGE_SIZE, pkey=5);
+		 * 5. T1   : enters fault handler, takes mmap_lock, etc...
+		 * 6. T1   : reaches here, sees vma_pkey(vma)=5, when we really
+		 *	     faulted on a pte with its pkey=4.
+		 */
+
+		if (pkey_fault)
+			fault_kind = SEGV_PKUERR;
+		else
+			fault_kind = fault == VM_FAULT_BADACCESS ? SEGV_ACCERR : SEGV_MAPERR;
+
+		arm64_force_sig_fault_pkey(SIGSEGV,
+				      fault_kind,
+				      far, inf->name, pkey);
 	}
 
 	return 0;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v4 16/29] arm64: add pte_access_permitted_no_overlay()
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (14 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 15/29] arm64: handle PKEY/POE faults Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-06-21 17:15   ` Catalin Marinas
  2024-07-16 10:21   ` Anshuman Khandual
  2024-05-03 13:01 ` [PATCH v4 17/29] arm64: implement PKEYS support Joey Gouly
                   ` (14 subsequent siblings)
  30 siblings, 2 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

We do not want to take POE into account when clearing the MTE tags.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/pgtable.h | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 5c970a9cca67..2449e4e27ea6 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -160,8 +160,10 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
  * not set) must return false. PROT_NONE mappings do not have the
  * PTE_VALID bit set.
  */
-#define pte_access_permitted(pte, write) \
+#define pte_access_permitted_no_overlay(pte, write) \
 	(((pte_val(pte) & (PTE_VALID | PTE_USER)) == (PTE_VALID | PTE_USER)) && (!(write) || pte_write(pte)))
+#define pte_access_permitted(pte, write) \
+	pte_access_permitted_no_overlay(pte, write)
 #define pmd_access_permitted(pmd, write) \
 	(pte_access_permitted(pmd_pte(pmd), (write)))
 #define pud_access_permitted(pud, write) \
@@ -348,10 +350,11 @@ static inline void __sync_cache_and_tags(pte_t pte, unsigned int nr_pages)
 	/*
 	 * If the PTE would provide user space access to the tags associated
 	 * with it then ensure that the MTE tags are synchronised.  Although
-	 * pte_access_permitted() returns false for exec only mappings, they
-	 * don't expose tags (instruction fetches don't check tags).
+	 * pte_access_permitted_no_overlay() returns false for exec only
+	 * mappings, they don't expose tags (instruction fetches don't check
+	 * tags).
 	 */
-	if (system_supports_mte() && pte_access_permitted(pte, false) &&
+	if (system_supports_mte() && pte_access_permitted_no_overlay(pte, false) &&
 	    !pte_special(pte) && pte_tagged(pte))
 		mte_sync_tags(pte, nr_pages);
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v4 17/29] arm64: implement PKEYS support
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (15 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 16/29] arm64: add pte_access_permitted_no_overlay() Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-05-28  6:55   ` Amit Daniel Kachhap
                     ` (5 more replies)
  2024-05-03 13:01 ` [PATCH v4 18/29] arm64: add POE signal support Joey Gouly
                   ` (13 subsequent siblings)
  30 siblings, 6 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

Implement the PKEYS interface, using the Permission Overlay Extension.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/mmu.h         |   1 +
 arch/arm64/include/asm/mmu_context.h |  51 ++++++++++++-
 arch/arm64/include/asm/pgtable.h     |  22 +++++-
 arch/arm64/include/asm/pkeys.h       | 110 +++++++++++++++++++++++++++
 arch/arm64/include/asm/por.h         |  33 ++++++++
 arch/arm64/mm/mmu.c                  |  40 ++++++++++
 6 files changed, 255 insertions(+), 2 deletions(-)
 create mode 100644 arch/arm64/include/asm/pkeys.h
 create mode 100644 arch/arm64/include/asm/por.h

diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 65977c7783c5..983afeb4eba5 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -25,6 +25,7 @@ typedef struct {
 	refcount_t	pinned;
 	void		*vdso;
 	unsigned long	flags;
+	u8		pkey_allocation_map;
 } mm_context_t;
 
 /*
diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index c768d16b81a4..cb499db7a97b 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -15,12 +15,12 @@
 #include <linux/sched/hotplug.h>
 #include <linux/mm_types.h>
 #include <linux/pgtable.h>
+#include <linux/pkeys.h>
 
 #include <asm/cacheflush.h>
 #include <asm/cpufeature.h>
 #include <asm/daifflags.h>
 #include <asm/proc-fns.h>
-#include <asm-generic/mm_hooks.h>
 #include <asm/cputype.h>
 #include <asm/sysreg.h>
 #include <asm/tlbflush.h>
@@ -175,9 +175,36 @@ init_new_context(struct task_struct *tsk, struct mm_struct *mm)
 {
 	atomic64_set(&mm->context.id, 0);
 	refcount_set(&mm->context.pinned, 0);
+
+	/* pkey 0 is the default, so always reserve it. */
+	mm->context.pkey_allocation_map = 0x1;
+
+	return 0;
+}
+
+static inline void arch_dup_pkeys(struct mm_struct *oldmm,
+				  struct mm_struct *mm)
+{
+	/* Duplicate the oldmm pkey state in mm: */
+	mm->context.pkey_allocation_map = oldmm->context.pkey_allocation_map;
+}
+
+static inline int arch_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm)
+{
+	arch_dup_pkeys(oldmm, mm);
+
 	return 0;
 }
 
+static inline void arch_exit_mmap(struct mm_struct *mm)
+{
+}
+
+static inline void arch_unmap(struct mm_struct *mm,
+			unsigned long start, unsigned long end)
+{
+}
+
 #ifdef CONFIG_ARM64_SW_TTBR0_PAN
 static inline void update_saved_ttbr0(struct task_struct *tsk,
 				      struct mm_struct *mm)
@@ -267,6 +294,28 @@ static inline unsigned long mm_untag_mask(struct mm_struct *mm)
 	return -1UL >> 8;
 }
 
+/*
+ * We only want to enforce protection keys on the current process
+ * because we effectively have no access to POR_EL0 for other
+ * processes or any way to tell *which* POR_EL0 in a threaded
+ * process we could use.
+ *
+ * So do not enforce things if the VMA is not from the current
+ * mm, or if we are in a kernel thread.
+ */
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
+		bool write, bool execute, bool foreign)
+{
+	if (!arch_pkeys_enabled())
+		return true;
+
+	/* allow access if the VMA is not one from this process */
+	if (foreign || vma_is_foreign(vma))
+		return true;
+
+	return por_el0_allows_pkey(vma_pkey(vma), write, execute);
+}
+
 #include <asm-generic/mmu_context.h>
 
 #endif /* !__ASSEMBLY__ */
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 2449e4e27ea6..8ee68ff03016 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -34,6 +34,7 @@
 
 #include <asm/cmpxchg.h>
 #include <asm/fixmap.h>
+#include <asm/por.h>
 #include <linux/mmdebug.h>
 #include <linux/mm_types.h>
 #include <linux/sched.h>
@@ -153,6 +154,24 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
 #define pte_accessible(mm, pte)	\
 	(mm_tlb_flush_pending(mm) ? pte_present(pte) : pte_valid(pte))
 
+static inline bool por_el0_allows_pkey(u8 pkey, bool write, bool execute)
+{
+	u64 por;
+
+	if (!system_supports_poe())
+		return true;
+
+	por = read_sysreg_s(SYS_POR_EL0);
+
+	if (write)
+		return por_elx_allows_write(por, pkey);
+
+	if (execute)
+		return por_elx_allows_exec(por, pkey);
+
+	return por_elx_allows_read(por, pkey);
+}
+
 /*
  * p??_access_permitted() is true for valid user mappings (PTE_USER
  * bit set, subject to the write permission check). For execute-only
@@ -163,7 +182,8 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
 #define pte_access_permitted_no_overlay(pte, write) \
 	(((pte_val(pte) & (PTE_VALID | PTE_USER)) == (PTE_VALID | PTE_USER)) && (!(write) || pte_write(pte)))
 #define pte_access_permitted(pte, write) \
-	pte_access_permitted_no_overlay(pte, write)
+	(pte_access_permitted_no_overlay(pte, write) && \
+	por_el0_allows_pkey(FIELD_GET(PTE_PO_IDX_MASK, pte_val(pte)), write, false))
 #define pmd_access_permitted(pmd, write) \
 	(pte_access_permitted(pmd_pte(pmd), (write)))
 #define pud_access_permitted(pud, write) \
diff --git a/arch/arm64/include/asm/pkeys.h b/arch/arm64/include/asm/pkeys.h
new file mode 100644
index 000000000000..a284508a4d02
--- /dev/null
+++ b/arch/arm64/include/asm/pkeys.h
@@ -0,0 +1,110 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2023 Arm Ltd.
+ *
+ * Based on arch/x86/include/asm/pkeys.h
+ */
+
+#ifndef _ASM_ARM64_PKEYS_H
+#define _ASM_ARM64_PKEYS_H
+
+#define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2)
+
+#define arch_max_pkey() 7
+
+int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
+		unsigned long init_val);
+
+static inline bool arch_pkeys_enabled(void)
+{
+	return false;
+}
+
+static inline int vma_pkey(struct vm_area_struct *vma)
+{
+	return (vma->vm_flags & ARCH_VM_PKEY_FLAGS) >> VM_PKEY_SHIFT;
+}
+
+static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
+		int prot, int pkey)
+{
+	if (pkey != -1)
+		return pkey;
+
+	return vma_pkey(vma);
+}
+
+static inline int execute_only_pkey(struct mm_struct *mm)
+{
+	// Execute-only mappings are handled by EPAN/FEAT_PAN3.
+	WARN_ON_ONCE(!cpus_have_final_cap(ARM64_HAS_EPAN));
+
+	return -1;
+}
+
+#define mm_pkey_allocation_map(mm)	(mm->context.pkey_allocation_map)
+#define mm_set_pkey_allocated(mm, pkey) do {		\
+	mm_pkey_allocation_map(mm) |= (1U << pkey);	\
+} while (0)
+#define mm_set_pkey_free(mm, pkey) do {			\
+	mm_pkey_allocation_map(mm) &= ~(1U << pkey);	\
+} while (0)
+
+static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
+{
+	/*
+	 * "Allocated" pkeys are those that have been returned
+	 * from pkey_alloc() or pkey 0 which is allocated
+	 * implicitly when the mm is created.
+	 */
+	if (pkey < 0)
+		return false;
+	if (pkey >= arch_max_pkey())
+		return false;
+
+	return mm_pkey_allocation_map(mm) & (1U << pkey);
+}
+
+/*
+ * Returns a positive, 3-bit key on success, or -1 on failure.
+ */
+static inline int mm_pkey_alloc(struct mm_struct *mm)
+{
+	/*
+	 * Note: this is the one and only place we make sure
+	 * that the pkey is valid as far as the hardware is
+	 * concerned.  The rest of the kernel trusts that
+	 * only good, valid pkeys come out of here.
+	 */
+	u8 all_pkeys_mask = ((1U << arch_max_pkey()) - 1);
+	int ret;
+
+	if (!arch_pkeys_enabled())
+		return -1;
+
+	/*
+	 * Are we out of pkeys?  We must handle this specially
+	 * because ffz() behavior is undefined if there are no
+	 * zeros.
+	 */
+	if (mm_pkey_allocation_map(mm) == all_pkeys_mask)
+		return -1;
+
+	ret = ffz(mm_pkey_allocation_map(mm));
+
+	mm_set_pkey_allocated(mm, ret);
+
+	return ret;
+}
+
+static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
+{
+	if (!mm_pkey_is_allocated(mm, pkey))
+		return -EINVAL;
+
+	mm_set_pkey_free(mm, pkey);
+
+	return 0;
+}
+
+#endif /* _ASM_ARM64_PKEYS_H */
diff --git a/arch/arm64/include/asm/por.h b/arch/arm64/include/asm/por.h
new file mode 100644
index 000000000000..d6604e0c5c54
--- /dev/null
+++ b/arch/arm64/include/asm/por.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2023 Arm Ltd.
+ */
+
+#ifndef _ASM_ARM64_POR_H
+#define _ASM_ARM64_POR_H
+
+#define POR_BITS_PER_PKEY		4
+#define POR_ELx_IDX(por_elx, idx)	(((por_elx) >> (idx * POR_BITS_PER_PKEY)) & 0xf)
+
+static inline bool por_elx_allows_read(u64 por, u8 pkey)
+{
+	u8 perm = POR_ELx_IDX(por, pkey);
+
+	return perm & POE_R;
+}
+
+static inline bool por_elx_allows_write(u64 por, u8 pkey)
+{
+	u8 perm = POR_ELx_IDX(por, pkey);
+
+	return perm & POE_W;
+}
+
+static inline bool por_elx_allows_exec(u64 por, u8 pkey)
+{
+	u8 perm = POR_ELx_IDX(por, pkey);
+
+	return perm & POE_X;
+}
+
+#endif /* _ASM_ARM64_POR_H */
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 495b732d5af3..e50ccc86d150 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -25,6 +25,7 @@
 #include <linux/vmalloc.h>
 #include <linux/set_memory.h>
 #include <linux/kfence.h>
+#include <linux/pkeys.h>
 
 #include <asm/barrier.h>
 #include <asm/cputype.h>
@@ -1535,3 +1536,42 @@ void __cpu_replace_ttbr1(pgd_t *pgdp, bool cnp)
 
 	cpu_uninstall_idmap();
 }
+
+#ifdef CONFIG_ARCH_HAS_PKEYS
+int arch_set_user_pkey_access(struct task_struct *tsk, int pkey, unsigned long init_val)
+{
+	u64 new_por = POE_RXW;
+	u64 old_por;
+	u64 pkey_shift;
+
+	if (!arch_pkeys_enabled())
+		return -ENOSPC;
+
+	/*
+	 * This code should only be called with valid 'pkey'
+	 * values originating from in-kernel users.  Complain
+	 * if a bad value is observed.
+	 */
+	if (WARN_ON_ONCE(pkey >= arch_max_pkey()))
+		return -EINVAL;
+
+	/* Set the bits we need in POR:  */
+	if (init_val & PKEY_DISABLE_ACCESS)
+		new_por = POE_X;
+	else if (init_val & PKEY_DISABLE_WRITE)
+		new_por = POE_RX;
+
+	/* Shift the bits in to the correct place in POR for pkey: */
+	pkey_shift = pkey * POR_BITS_PER_PKEY;
+	new_por <<= pkey_shift;
+
+	/* Get old POR and mask off any old bits in place: */
+	old_por = read_sysreg_s(SYS_POR_EL0);
+	old_por &= ~(POE_MASK << pkey_shift);
+
+	/* Write old part along with new part: */
+	write_sysreg_s(old_por | new_por, SYS_POR_EL0);
+
+	return 0;
+}
+#endif
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v4 18/29] arm64: add POE signal support
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (16 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 17/29] arm64: implement PKEYS support Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-05-28  6:56   ` Amit Daniel Kachhap
                     ` (4 more replies)
  2024-05-03 13:01 ` [PATCH v4 19/29] arm64: enable PKEY support for CPUs with S1POE Joey Gouly
                   ` (12 subsequent siblings)
  30 siblings, 5 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

Add PKEY support to signals by saving and restoring POR_EL0 in the signal frame.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
---
 arch/arm64/include/uapi/asm/sigcontext.h |  7 ++++
 arch/arm64/kernel/signal.c               | 52 ++++++++++++++++++++++++
 2 files changed, 59 insertions(+)

diff --git a/arch/arm64/include/uapi/asm/sigcontext.h b/arch/arm64/include/uapi/asm/sigcontext.h
index 8a45b7a411e0..e4cba8a6c9a2 100644
--- a/arch/arm64/include/uapi/asm/sigcontext.h
+++ b/arch/arm64/include/uapi/asm/sigcontext.h
@@ -98,6 +98,13 @@ struct esr_context {
 	__u64 esr;
 };
 
+#define POE_MAGIC	0x504f4530
+
+struct poe_context {
+	struct _aarch64_ctx head;
+	__u64 por_el0;
+};
+
 /*
  * extra_context: describes extra space in the signal frame for
  * additional structures that don't fit in sigcontext.__reserved[].
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 4a77f4976e11..077436a8bc10 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -63,6 +63,7 @@ struct rt_sigframe_user_layout {
 	unsigned long fpmr_offset;
 	unsigned long extra_offset;
 	unsigned long end_offset;
+	unsigned long poe_offset;
 };
 
 #define BASE_SIGFRAME_SIZE round_up(sizeof(struct rt_sigframe), 16)
@@ -185,6 +186,8 @@ struct user_ctxs {
 	u32 zt_size;
 	struct fpmr_context __user *fpmr;
 	u32 fpmr_size;
+	struct poe_context __user *poe;
+	u32 poe_size;
 };
 
 static int preserve_fpsimd_context(struct fpsimd_context __user *ctx)
@@ -258,6 +261,21 @@ static int restore_fpmr_context(struct user_ctxs *user)
 	return err;
 }
 
+static int restore_poe_context(struct user_ctxs *user)
+{
+	u64 por_el0;
+	int err = 0;
+
+	if (user->poe_size != sizeof(*user->poe))
+		return -EINVAL;
+
+	__get_user_error(por_el0, &(user->poe->por_el0), err);
+	if (!err)
+		write_sysreg_s(por_el0, SYS_POR_EL0);
+
+	return err;
+}
+
 #ifdef CONFIG_ARM64_SVE
 
 static int preserve_sve_context(struct sve_context __user *ctx)
@@ -621,6 +639,7 @@ static int parse_user_sigframe(struct user_ctxs *user,
 	user->za = NULL;
 	user->zt = NULL;
 	user->fpmr = NULL;
+	user->poe = NULL;
 
 	if (!IS_ALIGNED((unsigned long)base, 16))
 		goto invalid;
@@ -671,6 +690,17 @@ static int parse_user_sigframe(struct user_ctxs *user,
 			/* ignore */
 			break;
 
+		case POE_MAGIC:
+			if (!system_supports_poe())
+				goto invalid;
+
+			if (user->poe)
+				goto invalid;
+
+			user->poe = (struct poe_context __user *)head;
+			user->poe_size = size;
+			break;
+
 		case SVE_MAGIC:
 			if (!system_supports_sve() && !system_supports_sme())
 				goto invalid;
@@ -857,6 +887,9 @@ static int restore_sigframe(struct pt_regs *regs,
 	if (err == 0 && system_supports_sme2() && user.zt)
 		err = restore_zt_context(&user);
 
+	if (err == 0 && system_supports_poe() && user.poe)
+		err = restore_poe_context(&user);
+
 	return err;
 }
 
@@ -980,6 +1013,13 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
 			return err;
 	}
 
+	if (system_supports_poe()) {
+		err = sigframe_alloc(user, &user->poe_offset,
+				     sizeof(struct poe_context));
+		if (err)
+			return err;
+	}
+
 	return sigframe_alloc_end(user);
 }
 
@@ -1020,6 +1060,15 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user,
 		__put_user_error(current->thread.fault_code, &esr_ctx->esr, err);
 	}
 
+	if (system_supports_poe() && err == 0 && user->poe_offset) {
+		struct poe_context __user *poe_ctx =
+			apply_user_offset(user, user->poe_offset);
+
+		__put_user_error(POE_MAGIC, &poe_ctx->head.magic, err);
+		__put_user_error(sizeof(*poe_ctx), &poe_ctx->head.size, err);
+		__put_user_error(read_sysreg_s(SYS_POR_EL0), &poe_ctx->por_el0, err);
+	}
+
 	/* Scalable Vector Extension state (including streaming), if present */
 	if ((system_supports_sve() || system_supports_sme()) &&
 	    err == 0 && user->sve_offset) {
@@ -1178,6 +1227,9 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
 		sme_smstop();
 	}
 
+	if (system_supports_poe())
+		write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0);
+
 	if (ka->sa.sa_flags & SA_RESTORER)
 		sigtramp = ka->sa.sa_restorer;
 	else
-- 
2.25.1



* [PATCH v4 19/29] arm64: enable PKEY support for CPUs with S1POE
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (17 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 18/29] arm64: add POE signal support Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-07-16 10:47   ` Anshuman Khandual
  2024-07-25 16:00   ` Dave Martin
  2024-05-03 13:01 ` [PATCH v4 20/29] arm64: enable POE and PIE to coexist Joey Gouly
                   ` (11 subsequent siblings)
  30 siblings, 2 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

Now that PKEY support has been implemented, enable it for CPUs that
support S1POE.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/pkeys.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/pkeys.h b/arch/arm64/include/asm/pkeys.h
index a284508a4d02..3ea928ec94c0 100644
--- a/arch/arm64/include/asm/pkeys.h
+++ b/arch/arm64/include/asm/pkeys.h
@@ -17,7 +17,7 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 
 static inline bool arch_pkeys_enabled(void)
 {
-	return false;
+	return system_supports_poe();
 }
 
 static inline int vma_pkey(struct vm_area_struct *vma)
-- 
2.25.1



* [PATCH v4 20/29] arm64: enable POE and PIE to coexist
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (18 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 19/29] arm64: enable PKEY support for CPUs with S1POE Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-06-21 17:16   ` Catalin Marinas
  2024-07-16 10:41   ` Anshuman Khandual
  2024-05-03 13:01 ` [PATCH v4 21/29] arm64/ptrace: add support for FEAT_POE Joey Gouly
                   ` (10 subsequent siblings)
  30 siblings, 2 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

Set the EL0/userspace indirection encodings to be the overlay enabled
variants of the permissions.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/pgtable-prot.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
index dd9ee67d1d87..4f9f85437d3d 100644
--- a/arch/arm64/include/asm/pgtable-prot.h
+++ b/arch/arm64/include/asm/pgtable-prot.h
@@ -147,10 +147,10 @@ static inline bool __pure lpa2_is_enabled(void)
 
 #define PIE_E0	( \
 	PIRx_ELx_PERM(pte_pi_index(_PAGE_EXECONLY),      PIE_X_O) | \
-	PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY_EXEC), PIE_RX)  | \
-	PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED_EXEC),   PIE_RWX) | \
-	PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY),      PIE_R)   | \
-	PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED),        PIE_RW))
+	PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY_EXEC), PIE_RX_O)  | \
+	PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED_EXEC),   PIE_RWX_O) | \
+	PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY),      PIE_R_O)   | \
+	PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED),        PIE_RW_O))
 
 #define PIE_E1	( \
 	PIRx_ELx_PERM(pte_pi_index(_PAGE_EXECONLY),      PIE_NONE_O) | \
-- 
2.25.1



* [PATCH v4 21/29] arm64/ptrace: add support for FEAT_POE
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (19 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 20/29] arm64: enable POE and PIE to coexist Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-07-16 10:35   ` Anshuman Khandual
  2024-05-03 13:01 ` [PATCH v4 22/29] arm64: add Permission Overlay Extension Kconfig Joey Gouly
                   ` (9 subsequent siblings)
  30 siblings, 1 reply; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

Add a regset for POE containing POR_EL0.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/kernel/ptrace.c | 46 ++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/elf.h   |  1 +
 2 files changed, 47 insertions(+)

diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 0d022599eb61..b756578aeaee 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1440,6 +1440,39 @@ static int tagged_addr_ctrl_set(struct task_struct *target, const struct
 }
 #endif
 
+#ifdef CONFIG_ARM64_POE
+static int poe_get(struct task_struct *target,
+		   const struct user_regset *regset,
+		   struct membuf to)
+{
+	if (!system_supports_poe())
+		return -EINVAL;
+
+	return membuf_write(&to, &target->thread.por_el0,
+			    sizeof(target->thread.por_el0));
+}
+
+static int poe_set(struct task_struct *target, const struct
+		   user_regset *regset, unsigned int pos,
+		   unsigned int count, const void *kbuf, const
+		   void __user *ubuf)
+{
+	int ret;
+	long ctrl;
+
+	if (!system_supports_poe())
+		return -EINVAL;
+
+	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &ctrl, 0, -1);
+	if (ret)
+		return ret;
+
+	target->thread.por_el0 = ctrl;
+
+	return 0;
+}
+#endif
+
 enum aarch64_regset {
 	REGSET_GPR,
 	REGSET_FPR,
@@ -1469,6 +1502,9 @@ enum aarch64_regset {
 #ifdef CONFIG_ARM64_TAGGED_ADDR_ABI
 	REGSET_TAGGED_ADDR_CTRL,
 #endif
+#ifdef CONFIG_ARM64_POE
+	REGSET_POE
+#endif
 };
 
 static const struct user_regset aarch64_regsets[] = {
@@ -1628,6 +1664,16 @@ static const struct user_regset aarch64_regsets[] = {
 		.set = tagged_addr_ctrl_set,
 	},
 #endif
+#ifdef CONFIG_ARM64_POE
+	[REGSET_POE] = {
+		.core_note_type = NT_ARM_POE,
+		.n = 1,
+		.size = sizeof(long),
+		.align = sizeof(long),
+		.regset_get = poe_get,
+		.set = poe_set,
+	},
+#endif
 };
 
 static const struct user_regset_view user_aarch64_view = {
diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
index b54b313bcf07..81762ff3c99e 100644
--- a/include/uapi/linux/elf.h
+++ b/include/uapi/linux/elf.h
@@ -441,6 +441,7 @@ typedef struct elf64_shdr {
 #define NT_ARM_ZA	0x40c		/* ARM SME ZA registers */
 #define NT_ARM_ZT	0x40d		/* ARM SME ZT registers */
 #define NT_ARM_FPMR	0x40e		/* ARM floating point mode register */
+#define NT_ARM_POE	0x40f		/* ARM POE registers */
 #define NT_ARC_V2	0x600		/* ARCv2 accumulator/extra registers */
 #define NT_VMCOREDD	0x700		/* Vmcore Device Dump Note */
 #define NT_MIPS_DSP	0x800		/* MIPS DSP ASE registers */
-- 
2.25.1



* [PATCH v4 22/29] arm64: add Permission Overlay Extension Kconfig
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (20 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 21/29] arm64/ptrace: add support for FEAT_POE Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-07-05 17:05   ` Catalin Marinas
                     ` (2 more replies)
  2024-05-03 13:01 ` [PATCH v4 23/29] kselftest/arm64: move get_header() Joey Gouly
                   ` (8 subsequent siblings)
  30 siblings, 3 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

Now that support for POE and Protection Keys has been implemented, add a
Kconfig option to allow users to enable it.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/Kconfig | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 7b11c98b3e84..676ebe4bf9eb 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2095,6 +2095,28 @@ config ARM64_EPAN
 	  if the cpu does not implement the feature.
 endmenu # "ARMv8.7 architectural features"
 
+menu "ARMv8.9 architectural features"
+config ARM64_POE
+	prompt "Permission Overlay Extension"
+	def_bool y
+	select ARCH_USES_HIGH_VMA_FLAGS
+	select ARCH_HAS_PKEYS
+	help
+	  The Permission Overlay Extension is used to implement Memory
+	  Protection Keys. Memory Protection Keys provides a mechanism for
+	  enforcing page-based protections, but without requiring modification
+	  of the page tables when an application changes protection domains.
+
+	  For details, see Documentation/core-api/protection-keys.rst
+
+	  If unsure, say Y.
+
+config ARCH_PKEY_BITS
+	int
+	default 3
+
+endmenu # "ARMv8.9 architectural features"
+
 config ARM64_SVE
 	bool "ARM Scalable Vector Extension support"
 	default y
-- 
2.25.1


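With this Kconfig patch applied, the option is default-y on arm64, so no action is usually needed; for explicitly enabling it in a config fragment, a minimal, hypothetical sketch (option names as introduced by this patch, with the options it selects shown for reference) would be:

```
CONFIG_ARM64_POE=y
# Selected/implied automatically by ARM64_POE:
#   CONFIG_ARCH_HAS_PKEYS=y
#   CONFIG_ARCH_USES_HIGH_VMA_FLAGS=y
# Unconditional in this menu:
#   CONFIG_ARCH_PKEY_BITS=3
```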

* [PATCH v4 23/29] kselftest/arm64: move get_header()
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (21 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 22/29] arm64: add Permission Overlay Extension Kconfig Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-05-03 13:01 ` [PATCH v4 24/29] selftests: mm: move fpregs printing Joey Gouly
                   ` (7 subsequent siblings)
  30 siblings, 0 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

Put this function in the header so that it can be used by other tests without
needing to link against testcases.c.

This will be used by selftests/mm/protection_keys.c.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
---
 .../arm64/signal/testcases/testcases.c        | 23 -----------------
 .../arm64/signal/testcases/testcases.h        | 25 +++++++++++++++++--
 2 files changed, 23 insertions(+), 25 deletions(-)

diff --git a/tools/testing/selftests/arm64/signal/testcases/testcases.c b/tools/testing/selftests/arm64/signal/testcases/testcases.c
index 674b88cc8c39..e4331440fed0 100644
--- a/tools/testing/selftests/arm64/signal/testcases/testcases.c
+++ b/tools/testing/selftests/arm64/signal/testcases/testcases.c
@@ -6,29 +6,6 @@
 
 #include "testcases.h"
 
-struct _aarch64_ctx *get_header(struct _aarch64_ctx *head, uint32_t magic,
-				size_t resv_sz, size_t *offset)
-{
-	size_t offs = 0;
-	struct _aarch64_ctx *found = NULL;
-
-	if (!head || resv_sz < HDR_SZ)
-		return found;
-
-	while (offs <= resv_sz - HDR_SZ &&
-	       head->magic != magic && head->magic) {
-		offs += head->size;
-		head = GET_RESV_NEXT_HEAD(head);
-	}
-	if (head->magic == magic) {
-		found = head;
-		if (offset)
-			*offset = offs;
-	}
-
-	return found;
-}
-
 bool validate_extra_context(struct extra_context *extra, char **err,
 			    void **extra_data, size_t *extra_size)
 {
diff --git a/tools/testing/selftests/arm64/signal/testcases/testcases.h b/tools/testing/selftests/arm64/signal/testcases/testcases.h
index 7727126347e0..3185e6875694 100644
--- a/tools/testing/selftests/arm64/signal/testcases/testcases.h
+++ b/tools/testing/selftests/arm64/signal/testcases/testcases.h
@@ -88,8 +88,29 @@ struct fake_sigframe {
 
 bool validate_reserved(ucontext_t *uc, size_t resv_sz, char **err);
 
-struct _aarch64_ctx *get_header(struct _aarch64_ctx *head, uint32_t magic,
-				size_t resv_sz, size_t *offset);
+static inline struct _aarch64_ctx *get_header(struct _aarch64_ctx *head, uint32_t magic,
+				size_t resv_sz, size_t *offset)
+{
+	size_t offs = 0;
+	struct _aarch64_ctx *found = NULL;
+
+	if (!head || resv_sz < HDR_SZ)
+		return found;
+
+	while (offs <= resv_sz - HDR_SZ &&
+	       head->magic != magic && head->magic) {
+		offs += head->size;
+		head = GET_RESV_NEXT_HEAD(head);
+	}
+	if (head->magic == magic) {
+		found = head;
+		if (offset)
+			*offset = offs;
+	}
+
+	return found;
+}
+
 
 static inline struct _aarch64_ctx *get_terminator(struct _aarch64_ctx *head,
 						  size_t resv_sz,
-- 
2.25.1



* [PATCH v4 24/29] selftests: mm: move fpregs printing
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (22 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 23/29] kselftest/arm64: move get_header() Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-05-03 13:01 ` [PATCH v4 25/29] selftests: mm: make protection_keys test work on arm64 Joey Gouly
                   ` (6 subsequent siblings)
  30 siblings, 0 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

arm64's fpregs are not at a constant offset from sigcontext. Since this is
not an important part of the test, don't print the fpregs pointer on arm64.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
---
 tools/testing/selftests/mm/pkey-powerpc.h    | 1 +
 tools/testing/selftests/mm/pkey-x86.h        | 2 ++
 tools/testing/selftests/mm/protection_keys.c | 6 ++++++
 3 files changed, 9 insertions(+)

diff --git a/tools/testing/selftests/mm/pkey-powerpc.h b/tools/testing/selftests/mm/pkey-powerpc.h
index ae5df26104e5..6275d0f474b3 100644
--- a/tools/testing/selftests/mm/pkey-powerpc.h
+++ b/tools/testing/selftests/mm/pkey-powerpc.h
@@ -9,6 +9,7 @@
 #endif
 #define REG_IP_IDX		PT_NIP
 #define REG_TRAPNO		PT_TRAP
+#define MCONTEXT_FPREGS
 #define gregs			gp_regs
 #define fpregs			fp_regs
 #define si_pkey_offset		0x20
diff --git a/tools/testing/selftests/mm/pkey-x86.h b/tools/testing/selftests/mm/pkey-x86.h
index 814758e109c0..b9170a26bfcb 100644
--- a/tools/testing/selftests/mm/pkey-x86.h
+++ b/tools/testing/selftests/mm/pkey-x86.h
@@ -15,6 +15,8 @@
 
 #endif
 
+#define MCONTEXT_FPREGS
+
 #ifndef PKEY_DISABLE_ACCESS
 # define PKEY_DISABLE_ACCESS	0x1
 #endif
diff --git a/tools/testing/selftests/mm/protection_keys.c b/tools/testing/selftests/mm/protection_keys.c
index 48dc151f8fca..b3dbd76ea27c 100644
--- a/tools/testing/selftests/mm/protection_keys.c
+++ b/tools/testing/selftests/mm/protection_keys.c
@@ -314,7 +314,9 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 	ucontext_t *uctxt = vucontext;
 	int trapno;
 	unsigned long ip;
+#ifdef MCONTEXT_FPREGS
 	char *fpregs;
+#endif
 #if defined(__i386__) || defined(__x86_64__) /* arch */
 	u32 *pkey_reg_ptr;
 	int pkey_reg_offset;
@@ -330,7 +332,9 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 
 	trapno = uctxt->uc_mcontext.gregs[REG_TRAPNO];
 	ip = uctxt->uc_mcontext.gregs[REG_IP_IDX];
+#ifdef MCONTEXT_FPREGS
 	fpregs = (char *) uctxt->uc_mcontext.fpregs;
+#endif
 
 	dprintf2("%s() trapno: %d ip: 0x%016lx info->si_code: %s/%d\n",
 			__func__, trapno, ip, si_code_str(si->si_code),
@@ -359,7 +363,9 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 #endif /* arch */
 
 	dprintf1("siginfo: %p\n", si);
+#ifdef MCONTEXT_FPREGS
 	dprintf1(" fpregs: %p\n", fpregs);
+#endif
 
 	if ((si->si_code == SEGV_MAPERR) ||
 	    (si->si_code == SEGV_ACCERR) ||
-- 
2.25.1



* [PATCH v4 25/29] selftests: mm: make protection_keys test work on arm64
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (23 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 24/29] selftests: mm: move fpregs printing Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-05-03 13:01 ` [PATCH v4 26/29] kselftest/arm64: add HWCAP test for FEAT_S1POE Joey Gouly
                   ` (5 subsequent siblings)
  30 siblings, 0 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

The encoding of the pkey register differs between arm64 and x86/ppc: on those
platforms, a bit set in the register disables a permission, while on arm64 a
bit set in the register indicates that the permission is allowed.

This drops two asserts of the form:
	 assert(read_pkey_reg() <= orig_pkey_reg);
because on arm64 this does not hold, due to the encoding.

The pkey must be reset to both access-allowed and write-allowed in the signal
handler. pkey_access_allow() currently works for PowerPC because
PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE have overlapping bits set.

Access to the uc_mcontext is abstracted, as arm64 has a different structure.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
---
 .../arm64/signal/testcases/testcases.h        |   3 +
 tools/testing/selftests/mm/Makefile           |   2 +-
 tools/testing/selftests/mm/pkey-arm64.h       | 139 ++++++++++++++++++
 tools/testing/selftests/mm/pkey-helpers.h     |   8 +
 tools/testing/selftests/mm/pkey-powerpc.h     |   2 +
 tools/testing/selftests/mm/pkey-x86.h         |   2 +
 tools/testing/selftests/mm/protection_keys.c  | 103 +++++++++++--
 7 files changed, 247 insertions(+), 12 deletions(-)
 create mode 100644 tools/testing/selftests/mm/pkey-arm64.h

diff --git a/tools/testing/selftests/arm64/signal/testcases/testcases.h b/tools/testing/selftests/arm64/signal/testcases/testcases.h
index 3185e6875694..9872b8912714 100644
--- a/tools/testing/selftests/arm64/signal/testcases/testcases.h
+++ b/tools/testing/selftests/arm64/signal/testcases/testcases.h
@@ -26,6 +26,9 @@
 #define HDR_SZ \
 	sizeof(struct _aarch64_ctx)
 
+#define GET_UC_RESV_HEAD(uc) \
+	(struct _aarch64_ctx *)(&(uc->uc_mcontext.__reserved))
+
 #define GET_SF_RESV_HEAD(sf) \
 	(struct _aarch64_ctx *)(&(sf).uc.uc_mcontext.__reserved)
 
diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index eb5f39a2668b..18642fb4966f 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -98,7 +98,7 @@ TEST_GEN_FILES += $(BINARIES_64)
 endif
 else
 
-ifneq (,$(findstring $(ARCH),ppc64))
+ifneq (,$(filter $(ARCH),arm64 ppc64))
 TEST_GEN_FILES += protection_keys
 endif
 
diff --git a/tools/testing/selftests/mm/pkey-arm64.h b/tools/testing/selftests/mm/pkey-arm64.h
new file mode 100644
index 000000000000..d17cad022100
--- /dev/null
+++ b/tools/testing/selftests/mm/pkey-arm64.h
@@ -0,0 +1,139 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2023 Arm Ltd.
+ */
+
+#ifndef _PKEYS_ARM64_H
+#define _PKEYS_ARM64_H
+
+#include "vm_util.h"
+/* for signal frame parsing */
+#include "../arm64/signal/testcases/testcases.h"
+
+#ifndef SYS_mprotect_key
+# define SYS_mprotect_key	288
+#endif
+#ifndef SYS_pkey_alloc
+# define SYS_pkey_alloc		289
+# define SYS_pkey_free		290
+#endif
+#define MCONTEXT_IP(mc)		mc.pc
+#define MCONTEXT_TRAPNO(mc)	-1
+
+#define PKEY_MASK		0xf
+
+#define POE_NONE		0x0
+#define POE_X			0x2
+#define POE_RX			0x3
+#define POE_RWX			0x7
+
+#define NR_PKEYS		7
+#define NR_RESERVED_PKEYS	1 /* pkey-0 */
+
+#define PKEY_ALLOW_ALL		0x77777777
+
+#define PKEY_BITS_PER_PKEY	4
+#define PAGE_SIZE		sysconf(_SC_PAGESIZE)
+#undef HPAGE_SIZE
+#define HPAGE_SIZE		default_huge_page_size()
+
+/* 4-byte instructions * 16384 = 64K page */
+#define __page_o_noops() asm(".rept 16384 ; nop; .endr")
+
+static inline u64 __read_pkey_reg(void)
+{
+	u64 pkey_reg = 0;
+
+	// POR_EL0
+	asm volatile("mrs %0, S3_3_c10_c2_4" : "=r" (pkey_reg));
+
+	return pkey_reg;
+}
+
+static inline void __write_pkey_reg(u64 pkey_reg)
+{
+	u64 por = pkey_reg;
+
+	dprintf4("%s() changing %016llx to %016llx\n",
+			 __func__, __read_pkey_reg(), pkey_reg);
+
+	// POR_EL0
+	asm volatile("msr S3_3_c10_c2_4, %0\nisb" :: "r" (por) :);
+
+	dprintf4("%s() pkey register after changing %016llx to %016llx\n",
+			__func__, __read_pkey_reg(), pkey_reg);
+}
+
+static inline int cpu_has_pkeys(void)
+{
+	/* No simple way to determine this */
+	return 1;
+}
+
+static inline u32 pkey_bit_position(int pkey)
+{
+	return pkey * PKEY_BITS_PER_PKEY;
+}
+
+static inline int get_arch_reserved_keys(void)
+{
+	return NR_RESERVED_PKEYS;
+}
+
+void expect_fault_on_read_execonly_key(void *p1, int pkey)
+{
+}
+
+void *malloc_pkey_with_mprotect_subpage(long size, int prot, u16 pkey)
+{
+	return PTR_ERR_ENOTSUP;
+}
+
+#define set_pkey_bits	set_pkey_bits
+static inline u64 set_pkey_bits(u64 reg, int pkey, u64 flags)
+{
+	u32 shift = pkey_bit_position(pkey);
+	u64 new_val = POE_RWX;
+
+	/* mask out bits from pkey in old value */
+	reg &= ~((u64)PKEY_MASK << shift);
+
+	if (flags & PKEY_DISABLE_ACCESS)
+		new_val = POE_X;
+	else if (flags & PKEY_DISABLE_WRITE)
+		new_val = POE_RX;
+
+	/* OR in new bits for pkey */
+	reg |= new_val << shift;
+
+	return reg;
+}
+
+#define get_pkey_bits	get_pkey_bits
+static inline u64 get_pkey_bits(u64 reg, int pkey)
+{
+	u32 shift = pkey_bit_position(pkey);
+	/*
+	 * shift down the relevant bits to the lowest four, then
+	 * mask off all the other higher bits
+	 */
+	u32 perm = (reg >> shift) & PKEY_MASK;
+
+	if (perm == POE_X)
+		return PKEY_DISABLE_ACCESS;
+	if (perm == POE_RX)
+		return PKEY_DISABLE_WRITE;
+	return 0;
+}
+
+static void aarch64_write_signal_pkey(ucontext_t *uctxt, u64 pkey)
+{
+	struct _aarch64_ctx *ctx = GET_UC_RESV_HEAD(uctxt);
+	struct poe_context *poe_ctx =
+		(struct poe_context *) get_header(ctx, POE_MAGIC,
+						sizeof(uctxt->uc_mcontext), NULL);
+	if (poe_ctx)
+		poe_ctx->por_el0 = pkey;
+}
+
+#endif /* _PKEYS_ARM64_H */
diff --git a/tools/testing/selftests/mm/pkey-helpers.h b/tools/testing/selftests/mm/pkey-helpers.h
index 1af3156a9db8..15608350fc01 100644
--- a/tools/testing/selftests/mm/pkey-helpers.h
+++ b/tools/testing/selftests/mm/pkey-helpers.h
@@ -91,12 +91,17 @@ void record_pkey_malloc(void *ptr, long size, int prot);
 #include "pkey-x86.h"
 #elif defined(__powerpc64__) /* arch */
 #include "pkey-powerpc.h"
+#elif defined(__aarch64__) /* arch */
+#include "pkey-arm64.h"
 #else /* arch */
 #error Architecture not supported
 #endif /* arch */
 
+#ifndef PKEY_MASK
 #define PKEY_MASK	(PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE)
+#endif
 
+#ifndef set_pkey_bits
 static inline u64 set_pkey_bits(u64 reg, int pkey, u64 flags)
 {
 	u32 shift = pkey_bit_position(pkey);
@@ -106,7 +111,9 @@ static inline u64 set_pkey_bits(u64 reg, int pkey, u64 flags)
 	reg |= (flags & PKEY_MASK) << shift;
 	return reg;
 }
+#endif
 
+#ifndef get_pkey_bits
 static inline u64 get_pkey_bits(u64 reg, int pkey)
 {
 	u32 shift = pkey_bit_position(pkey);
@@ -116,6 +123,7 @@ static inline u64 get_pkey_bits(u64 reg, int pkey)
 	 */
 	return ((reg >> shift) & PKEY_MASK);
 }
+#endif
 
 extern u64 shadow_pkey_reg;
 
diff --git a/tools/testing/selftests/mm/pkey-powerpc.h b/tools/testing/selftests/mm/pkey-powerpc.h
index 6275d0f474b3..3d0c0bdae5bc 100644
--- a/tools/testing/selftests/mm/pkey-powerpc.h
+++ b/tools/testing/selftests/mm/pkey-powerpc.h
@@ -8,6 +8,8 @@
 # define SYS_pkey_free		385
 #endif
 #define REG_IP_IDX		PT_NIP
+#define MCONTEXT_IP(mc)		mc.gp_regs[REG_IP_IDX]
+#define MCONTEXT_TRAPNO(mc)	mc.gp_regs[REG_TRAPNO]
 #define REG_TRAPNO		PT_TRAP
 #define MCONTEXT_FPREGS
 #define gregs			gp_regs
diff --git a/tools/testing/selftests/mm/pkey-x86.h b/tools/testing/selftests/mm/pkey-x86.h
index b9170a26bfcb..5f28e26a2511 100644
--- a/tools/testing/selftests/mm/pkey-x86.h
+++ b/tools/testing/selftests/mm/pkey-x86.h
@@ -15,6 +15,8 @@
 
 #endif
 
+#define MCONTEXT_IP(mc)		mc.gregs[REG_IP_IDX]
+#define MCONTEXT_TRAPNO(mc)	mc.gregs[REG_TRAPNO]
 #define MCONTEXT_FPREGS
 
 #ifndef PKEY_DISABLE_ACCESS
diff --git a/tools/testing/selftests/mm/protection_keys.c b/tools/testing/selftests/mm/protection_keys.c
index b3dbd76ea27c..989fdf489e33 100644
--- a/tools/testing/selftests/mm/protection_keys.c
+++ b/tools/testing/selftests/mm/protection_keys.c
@@ -147,7 +147,7 @@ void abort_hooks(void)
  * will then fault, which makes sure that the fault code handles
  * execute-only memory properly.
  */
-#ifdef __powerpc64__
+#if defined(__powerpc64__) || defined(__aarch64__)
 /* This way, both 4K and 64K alignment are maintained */
 __attribute__((__aligned__(65536)))
 #else
@@ -212,7 +212,6 @@ void pkey_disable_set(int pkey, int flags)
 	unsigned long syscall_flags = 0;
 	int ret;
 	int pkey_rights;
-	u64 orig_pkey_reg = read_pkey_reg();
 
 	dprintf1("START->%s(%d, 0x%x)\n", __func__,
 		pkey, flags);
@@ -242,8 +241,6 @@ void pkey_disable_set(int pkey, int flags)
 
 	dprintf1("%s(%d) pkey_reg: 0x%016llx\n",
 		__func__, pkey, read_pkey_reg());
-	if (flags)
-		pkey_assert(read_pkey_reg() >= orig_pkey_reg);
 	dprintf1("END<---%s(%d, 0x%x)\n", __func__,
 		pkey, flags);
 }
@@ -253,7 +250,6 @@ void pkey_disable_clear(int pkey, int flags)
 	unsigned long syscall_flags = 0;
 	int ret;
 	int pkey_rights = hw_pkey_get(pkey, syscall_flags);
-	u64 orig_pkey_reg = read_pkey_reg();
 
 	pkey_assert(flags & (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
 
@@ -273,8 +269,6 @@ void pkey_disable_clear(int pkey, int flags)
 
 	dprintf1("%s(%d) pkey_reg: 0x%016llx\n", __func__,
 			pkey, read_pkey_reg());
-	if (flags)
-		assert(read_pkey_reg() <= orig_pkey_reg);
 }
 
 void pkey_write_allow(int pkey)
@@ -330,8 +324,8 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 			__func__, __LINE__,
 			__read_pkey_reg(), shadow_pkey_reg);
 
-	trapno = uctxt->uc_mcontext.gregs[REG_TRAPNO];
-	ip = uctxt->uc_mcontext.gregs[REG_IP_IDX];
+	trapno = MCONTEXT_TRAPNO(uctxt->uc_mcontext);
+	ip = MCONTEXT_IP(uctxt->uc_mcontext);
 #ifdef MCONTEXT_FPREGS
 	fpregs = (char *) uctxt->uc_mcontext.fpregs;
 #endif
@@ -395,6 +389,8 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 #elif defined(__powerpc64__) /* arch */
 	/* restore access and let the faulting instruction continue */
 	pkey_access_allow(siginfo_pkey);
+#elif defined(__aarch64__)
+	aarch64_write_signal_pkey(uctxt, PKEY_ALLOW_ALL);
 #endif /* arch */
 	pkey_faults++;
 	dprintf1("<<<<==================================================\n");
@@ -908,7 +904,9 @@ void expected_pkey_fault(int pkey)
 	 * test program continue.  We now have to restore it.
 	 */
 	if (__read_pkey_reg() != 0)
-#else /* arch */
+#elif defined(__aarch64__)
+	if (__read_pkey_reg() != PKEY_ALLOW_ALL)
+#else
 	if (__read_pkey_reg() != shadow_pkey_reg)
 #endif /* arch */
 		pkey_assert(0);
@@ -1498,6 +1496,11 @@ void test_executing_on_unreadable_memory(int *ptr, u16 pkey)
 	lots_o_noops_around_write(&scratch);
 	do_not_expect_pkey_fault("executing on PROT_EXEC memory");
 	expect_fault_on_read_execonly_key(p1, pkey);
+
+	// Reset back to PROT_EXEC | PROT_READ for architectures that support
+	// non-PKEY execute-only permissions.
+	ret = mprotect_pkey(p1, PAGE_SIZE, PROT_EXEC | PROT_READ, (u64)pkey);
+	pkey_assert(!ret);
 }
 
 void test_implicit_mprotect_exec_only_memory(int *ptr, u16 pkey)
@@ -1671,6 +1674,84 @@ void test_ptrace_modifies_pkru(int *ptr, u16 pkey)
 }
 #endif
 
+#if defined(__aarch64__)
+void test_ptrace_modifies_pkru(int *ptr, u16 pkey)
+{
+	pid_t child;
+	int status, ret;
+	struct iovec iov;
+	u64 trace_pkey;
+	/* Just an arbitrary pkey value. */
+	u64 new_pkey = (POE_X << PKEY_BITS_PER_PKEY * 2) |
+			(POE_NONE << PKEY_BITS_PER_PKEY) |
+			POE_RWX;
+
+	child = fork();
+	pkey_assert(child >= 0);
+	dprintf3("[%d] fork() ret: %d\n", getpid(), child);
+	if (!child) {
+		ptrace(PTRACE_TRACEME, 0, 0, 0);
+
+		/* Stop and allow the tracer to modify PKRU directly */
+		raise(SIGSTOP);
+
+		/*
+		 * need __read_pkey_reg() version so we do not do shadow_pkey_reg
+		 * checking
+		 */
+		if (__read_pkey_reg() != new_pkey)
+			exit(1);
+
+		raise(SIGSTOP);
+
+		exit(0);
+	}
+
+	pkey_assert(child == waitpid(child, &status, 0));
+	dprintf3("[%d] waitpid(%d) status: %x\n", getpid(), child, status);
+	pkey_assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGSTOP);
+
+	iov.iov_base = &trace_pkey;
+	iov.iov_len = 8;
+	ret = ptrace(PTRACE_GETREGSET, child, (void *)NT_ARM_POE, &iov);
+	pkey_assert(ret == 0);
+	pkey_assert(trace_pkey == read_pkey_reg());
+
+	trace_pkey = new_pkey;
+
+	ret = ptrace(PTRACE_SETREGSET, child, (void *)NT_ARM_POE, &iov);
+	pkey_assert(ret == 0);
+
+	/* Test that the modification is visible in ptrace before any execution */
+	memset(&trace_pkey, 0, sizeof(trace_pkey));
+	ret = ptrace(PTRACE_GETREGSET, child, (void *)NT_ARM_POE, &iov);
+	pkey_assert(ret == 0);
+	pkey_assert(trace_pkey == new_pkey);
+
+	/* Execute the tracee */
+	ret = ptrace(PTRACE_CONT, child, 0, 0);
+	pkey_assert(ret == 0);
+
+	/* Test that the tracee saw the PKRU value change */
+	pkey_assert(child == waitpid(child, &status, 0));
+	dprintf3("[%d] waitpid(%d) status: %x\n", getpid(), child, status);
+	pkey_assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGSTOP);
+
+	/* Test that the modification is visible in ptrace after execution */
+	memset(&trace_pkey, 0, sizeof(trace_pkey));
+	ret = ptrace(PTRACE_GETREGSET, child, (void *)NT_ARM_POE, &iov);
+	pkey_assert(ret == 0);
+	pkey_assert(trace_pkey == new_pkey);
+
+	ret = ptrace(PTRACE_CONT, child, 0, 0);
+	pkey_assert(ret == 0);
+	pkey_assert(child == waitpid(child, &status, 0));
+	dprintf3("[%d] waitpid(%d) status: %x\n", getpid(), child, status);
+	pkey_assert(WIFEXITED(status));
+	pkey_assert(WEXITSTATUS(status) == 0);
+}
+#endif
+
 void test_mprotect_pkey_on_unsupported_cpu(int *ptr, u16 pkey)
 {
 	int size = PAGE_SIZE;
@@ -1706,7 +1787,7 @@ void (*pkey_tests[])(int *ptr, u16 pkey) = {
 	test_pkey_syscalls_bad_args,
 	test_pkey_alloc_exhaust,
 	test_pkey_alloc_free_attach_pkey0,
-#if defined(__i386__) || defined(__x86_64__)
+#if defined(__i386__) || defined(__x86_64__) || defined(__aarch64__)
 	test_ptrace_modifies_pkru,
 #endif
 };
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v4 26/29] kselftest/arm64: add HWCAP test for FEAT_S1POE
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (24 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 25/29] selftests: mm: make protection_keys test work on arm64 Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-05-03 13:01 ` [PATCH v4 27/29] kselftest/arm64: parse POE_MAGIC in a signal frame Joey Gouly
                   ` (4 subsequent siblings)
  30 siblings, 0 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

Check that when POE is enabled, the POR_EL0 register is accessible.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
---
 tools/testing/selftests/arm64/abi/hwcap.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/tools/testing/selftests/arm64/abi/hwcap.c b/tools/testing/selftests/arm64/abi/hwcap.c
index d8909b2b535a..f2d6007a2b98 100644
--- a/tools/testing/selftests/arm64/abi/hwcap.c
+++ b/tools/testing/selftests/arm64/abi/hwcap.c
@@ -156,6 +156,12 @@ static void pmull_sigill(void)
 	asm volatile(".inst 0x0ee0e000" : : : );
 }
 
+static void poe_sigill(void)
+{
+	/* mrs x0, POR_EL0 */
+	asm volatile("mrs x0, S3_3_C10_C2_4" : : : "x0");
+}
+
 static void rng_sigill(void)
 {
 	asm volatile("mrs x0, S3_3_C2_C4_0" : : : "x0");
@@ -601,6 +607,14 @@ static const struct hwcap_data {
 		.cpuinfo = "pmull",
 		.sigill_fn = pmull_sigill,
 	},
+	{
+		.name = "POE",
+		.at_hwcap = AT_HWCAP2,
+		.hwcap_bit = HWCAP2_POE,
+		.cpuinfo = "poe",
+		.sigill_fn = poe_sigill,
+		.sigill_reliable = true,
+	},
 	{
 		.name = "RNG",
 		.at_hwcap = AT_HWCAP2,
-- 
2.25.1



* [PATCH v4 27/29] kselftest/arm64: parse POE_MAGIC in a signal frame
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (25 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 26/29] kselftest/arm64: add HWCAP test for FEAT_S1POE Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-05-03 13:01 ` [PATCH v4 28/29] kselftest/arm64: Add test case for POR_EL0 signal frame records Joey Gouly
                   ` (3 subsequent siblings)
  30 siblings, 0 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

Teach the signal frame parsing about the new POE frame, avoiding a warning when
it is generated.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
---
 tools/testing/selftests/arm64/signal/testcases/testcases.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/testing/selftests/arm64/signal/testcases/testcases.c b/tools/testing/selftests/arm64/signal/testcases/testcases.c
index e4331440fed0..e6daa94fcd2e 100644
--- a/tools/testing/selftests/arm64/signal/testcases/testcases.c
+++ b/tools/testing/selftests/arm64/signal/testcases/testcases.c
@@ -161,6 +161,10 @@ bool validate_reserved(ucontext_t *uc, size_t resv_sz, char **err)
 			if (head->size != sizeof(struct esr_context))
 				*err = "Bad size for esr_context";
 			break;
+		case POE_MAGIC:
+			if (head->size != sizeof(struct poe_context))
+				*err = "Bad size for poe_context";
+			break;
 		case TPIDR2_MAGIC:
 			if (head->size != sizeof(struct tpidr2_context))
 				*err = "Bad size for tpidr2_context";
-- 
2.25.1



* [PATCH v4 28/29] kselftest/arm64: Add test case for POR_EL0 signal frame records
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (26 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 27/29] kselftest/arm64: parse POE_MAGIC in a signal frame Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-05-29 15:51   ` Mark Brown
  2024-07-09 13:10   ` Kevin Brodsky
  2024-05-03 13:01 ` [PATCH v4 29/29] KVM: selftests: get-reg-list: add Permission Overlay registers Joey Gouly
                   ` (2 subsequent siblings)
  30 siblings, 2 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

Ensure that we get signal context for POR_EL0 if and only if POE is present
on the system.

Copied from the TPIDR2 test.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
---
 .../testing/selftests/arm64/signal/.gitignore |  1 +
 .../arm64/signal/testcases/poe_siginfo.c      | 86 +++++++++++++++++++
 2 files changed, 87 insertions(+)
 create mode 100644 tools/testing/selftests/arm64/signal/testcases/poe_siginfo.c

diff --git a/tools/testing/selftests/arm64/signal/.gitignore b/tools/testing/selftests/arm64/signal/.gitignore
index 1ce5b5eac386..b2f2bfd5c6aa 100644
--- a/tools/testing/selftests/arm64/signal/.gitignore
+++ b/tools/testing/selftests/arm64/signal/.gitignore
@@ -2,6 +2,7 @@
 mangle_*
 fake_sigreturn_*
 fpmr_*
+poe_*
 sme_*
 ssve_*
 sve_*
diff --git a/tools/testing/selftests/arm64/signal/testcases/poe_siginfo.c b/tools/testing/selftests/arm64/signal/testcases/poe_siginfo.c
new file mode 100644
index 000000000000..d890029304c4
--- /dev/null
+++ b/tools/testing/selftests/arm64/signal/testcases/poe_siginfo.c
@@ -0,0 +1,86 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2023 Arm Limited
+ *
+ * Verify that the POR_EL0 register context in signal frames is set up as
+ * expected.
+ */
+
+#include <signal.h>
+#include <ucontext.h>
+#include <sys/auxv.h>
+#include <sys/prctl.h>
+#include <unistd.h>
+#include <asm/sigcontext.h>
+
+#include "test_signals_utils.h"
+#include "testcases.h"
+
+static union {
+	ucontext_t uc;
+	char buf[1024 * 128];
+} context;
+
+#define SYS_POR_EL0 "S3_3_C10_C2_4"
+
+static uint64_t get_por_el0(void)
+{
+	uint64_t val;
+
+	asm volatile (
+		"mrs	%0, " SYS_POR_EL0 "\n"
+		: "=r"(val)
+		:
+		: "cc");
+
+	return val;
+}
+
+int poe_present(struct tdescr *td, siginfo_t *si, ucontext_t *uc)
+{
+	struct _aarch64_ctx *head = GET_BUF_RESV_HEAD(context);
+	struct poe_context *poe_ctx;
+	size_t offset;
+	bool in_sigframe;
+	bool have_poe;
+	__u64 orig_poe;
+
+	have_poe = getauxval(AT_HWCAP2) & HWCAP2_POE;
+	if (have_poe)
+		orig_poe = get_por_el0();
+
+	if (!get_current_context(td, &context.uc, sizeof(context)))
+		return 1;
+
+	poe_ctx = (struct poe_context *)
+		get_header(head, POE_MAGIC, td->live_sz, &offset);
+
+	in_sigframe = poe_ctx != NULL;
+
+	fprintf(stderr, "POR_EL0 sigframe %s on system %s POE\n",
+		in_sigframe ? "present" : "absent",
+		have_poe ? "with" : "without");
+
+	td->pass = (in_sigframe == have_poe);
+
+	/*
+	 * Check that the value we read back was the one present at
+	 * the time that the signal was triggered.
+	 */
+	if (have_poe && poe_ctx) {
+		if (poe_ctx->por_el0 != orig_poe) {
+			fprintf(stderr, "POR_EL0 in frame is %llx, was %llx\n",
+				poe_ctx->por_el0, orig_poe);
+			td->pass = false;
+		}
+	}
+
+	return 0;
+}
+
+struct tdescr tde = {
+	.name = "POR_EL0",
+	.descr = "Validate that POR_EL0 is present as expected",
+	.timeout = 3,
+	.run = poe_present,
+};
-- 
2.25.1



* [PATCH v4 29/29] KVM: selftests: get-reg-list: add Permission Overlay registers
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (27 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 28/29] kselftest/arm64: Add test case for POR_EL0 signal frame records Joey Gouly
@ 2024-05-03 13:01 ` Joey Gouly
  2024-05-05 14:41 ` [PATCH v4 00/29] arm64: Permission Overlay Extension Mark Brown
  2024-05-28 11:30 ` Joey Gouly
  30 siblings, 0 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-03 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

Add new system registers:
  - POR_EL1
  - POR_EL0

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Shuah Khan <shuah@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
---
 tools/testing/selftests/kvm/aarch64/get-reg-list.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/tools/testing/selftests/kvm/aarch64/get-reg-list.c b/tools/testing/selftests/kvm/aarch64/get-reg-list.c
index 709d7d721760..ac661ebf6859 100644
--- a/tools/testing/selftests/kvm/aarch64/get-reg-list.c
+++ b/tools/testing/selftests/kvm/aarch64/get-reg-list.c
@@ -40,6 +40,18 @@ static struct feature_id_reg feat_id_regs[] = {
 		ARM64_SYS_REG(3, 0, 0, 7, 3),	/* ID_AA64MMFR3_EL1 */
 		4,
 		1
+	},
+	{
+		ARM64_SYS_REG(3, 0, 10, 2, 4),	/* POR_EL1 */
+		ARM64_SYS_REG(3, 0, 0, 7, 3),	/* ID_AA64MMFR3_EL1 */
+		16,
+		1
+	},
+	{
+		ARM64_SYS_REG(3, 3, 10, 2, 4),	/* POR_EL0 */
+		ARM64_SYS_REG(3, 0, 0, 7, 3),	/* ID_AA64MMFR3_EL1 */
+		16,
+		1
 	}
 };
 
@@ -468,6 +480,7 @@ static __u64 base_regs[] = {
 	ARM64_SYS_REG(3, 0, 10, 2, 0),	/* MAIR_EL1 */
 	ARM64_SYS_REG(3, 0, 10, 2, 2),	/* PIRE0_EL1 */
 	ARM64_SYS_REG(3, 0, 10, 2, 3),	/* PIR_EL1 */
+	ARM64_SYS_REG(3, 0, 10, 2, 4),	/* POR_EL1 */
 	ARM64_SYS_REG(3, 0, 10, 3, 0),	/* AMAIR_EL1 */
 	ARM64_SYS_REG(3, 0, 12, 0, 0),	/* VBAR_EL1 */
 	ARM64_SYS_REG(3, 0, 12, 1, 1),	/* DISR_EL1 */
@@ -475,6 +488,7 @@ static __u64 base_regs[] = {
 	ARM64_SYS_REG(3, 0, 13, 0, 4),	/* TPIDR_EL1 */
 	ARM64_SYS_REG(3, 0, 14, 1, 0),	/* CNTKCTL_EL1 */
 	ARM64_SYS_REG(3, 2, 0, 0, 0),	/* CSSELR_EL1 */
+	ARM64_SYS_REG(3, 3, 10, 2, 4),	/* POR_EL0 */
 	ARM64_SYS_REG(3, 3, 13, 0, 2),	/* TPIDR_EL0 */
 	ARM64_SYS_REG(3, 3, 13, 0, 3),	/* TPIDRRO_EL0 */
 	ARM64_SYS_REG(3, 3, 14, 0, 1),	/* CNTPCT_EL0 */
-- 
2.25.1



* Re: [PATCH v4 02/29] x86/mm: add ARCH_PKEY_BITS to Kconfig
  2024-05-03 13:01 ` [PATCH v4 02/29] x86/mm: " Joey Gouly
@ 2024-05-03 16:40   ` Dave Hansen
  0 siblings, 0 replies; 146+ messages in thread
From: Dave Hansen @ 2024-05-03 16:40 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On 5/3/24 06:01, Joey Gouly wrote:
> The new config option specifies how many bits are in each PKEY.

Acked-by: Dave Hansen <dave.hansen@linux.intel.com>


* Re: [PATCH v4 03/29] mm: use ARCH_PKEY_BITS to define VM_PKEY_BITN
  2024-05-03 13:01 ` [PATCH v4 03/29] mm: use ARCH_PKEY_BITS to define VM_PKEY_BITN Joey Gouly
@ 2024-05-03 16:41   ` Dave Hansen
  2024-07-15  7:53   ` Anshuman Khandual
  1 sibling, 0 replies; 146+ messages in thread
From: Dave Hansen @ 2024-05-03 16:41 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On 5/3/24 06:01, Joey Gouly wrote:
>  #ifdef CONFIG_ARCH_HAS_PKEYS
> -# define VM_PKEY_SHIFT	VM_HIGH_ARCH_BIT_0
> -# define VM_PKEY_BIT0	VM_HIGH_ARCH_0	/* A protection key is a 4-bit value */
> -# define VM_PKEY_BIT1	VM_HIGH_ARCH_1	/* on x86 and 5-bit value on ppc64   */
> -# define VM_PKEY_BIT2	VM_HIGH_ARCH_2
> -# define VM_PKEY_BIT3	VM_HIGH_ARCH_3
> -#ifdef CONFIG_PPC
> +# define VM_PKEY_SHIFT VM_HIGH_ARCH_BIT_0
> +# define VM_PKEY_BIT0  VM_HIGH_ARCH_0
> +# define VM_PKEY_BIT1  VM_HIGH_ARCH_1
> +# define VM_PKEY_BIT2  VM_HIGH_ARCH_2
> +#if CONFIG_ARCH_PKEY_BITS > 3
> +# define VM_PKEY_BIT3  VM_HIGH_ARCH_3
> +#else
> +# define VM_PKEY_BIT3  0
> +#endif
> +#if CONFIG_ARCH_PKEY_BITS > 4

It's certainly not pretty, but it does get the arch #ifdef out of
generic code.  We might need to rethink this if we get another
architecture or two, but this seems manageable for now.

Acked-by: Dave Hansen <dave.hansen@linux.intel.com>


* Re: [PATCH v4 00/29] arm64: Permission Overlay Extension
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (28 preceding siblings ...)
  2024-05-03 13:01 ` [PATCH v4 29/29] KVM: selftests: get-reg-list: add Permission Overlay registers Joey Gouly
@ 2024-05-05 14:41 ` Mark Brown
  2024-05-28 11:30 ` Joey Gouly
  30 siblings, 0 replies; 146+ messages in thread
From: Mark Brown @ 2024-05-05 14:41 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm


On Fri, May 03, 2024 at 02:01:18PM +0100, Joey Gouly wrote:

> One possible issue with this version, I took the last bit of HWCAP2.

Fortunately we already have AT_HWCAP[34] defined thanks to PowerPC.



* Re: [PATCH v4 01/29] powerpc/mm: add ARCH_PKEY_BITS to Kconfig
  2024-05-03 13:01 ` [PATCH v4 01/29] powerpc/mm: add ARCH_PKEY_BITS to Kconfig Joey Gouly
@ 2024-05-06  8:57   ` Michael Ellerman
  0 siblings, 0 replies; 146+ messages in thread
From: Michael Ellerman @ 2024-05-06  8:57 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, joey.gouly, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

Joey Gouly <joey.gouly@arm.com> writes:
> The new config option specifies how many bits are in each PKEY.
>
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Nicholas Piggin <npiggin@gmail.com>
> Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
> Cc: linuxppc-dev@lists.ozlabs.org
> ---
>  arch/powerpc/Kconfig | 4 ++++
>  1 file changed, 4 insertions(+)

Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)

cheers

> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 1c4be3373686..6e33e4726856 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -1020,6 +1020,10 @@ config PPC_MEM_KEYS
>  
>  	  If unsure, say y.
>  
> +config ARCH_PKEY_BITS
> +	int
> +	default 5
> +
>  config PPC_SECURE_BOOT
>  	prompt "Enable secure boot support"
>  	bool
> -- 
> 2.25.1


* Re: [PATCH v4 13/29] arm64: convert protection key into vm_flags and pgprot values
  2024-05-03 13:01 ` [PATCH v4 13/29] arm64: convert protection key into vm_flags and pgprot values Joey Gouly
@ 2024-05-28  6:54   ` Amit Daniel Kachhap
  2024-06-19 16:45     ` Catalin Marinas
  2024-07-16  9:05   ` Anshuman Khandual
  2024-07-25 15:49   ` Dave Martin
  2 siblings, 1 reply; 146+ messages in thread
From: Amit Daniel Kachhap @ 2024-05-28  6:54 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm



On 5/3/24 18:31, Joey Gouly wrote:
> Modify arch_calc_vm_prot_bits() and vm_get_page_prot() such that the pkey
> value is set in the vm_flags and then into the pgprot value.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
>   arch/arm64/include/asm/mman.h | 8 +++++++-
>   arch/arm64/mm/mmap.c          | 9 +++++++++
>   2 files changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
> index 5966ee4a6154..ecb2d18dc4d7 100644
> --- a/arch/arm64/include/asm/mman.h
> +++ b/arch/arm64/include/asm/mman.h
> @@ -7,7 +7,7 @@
>   #include <uapi/asm/mman.h>
>   
>   static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> -	unsigned long pkey __always_unused)
> +	unsigned long pkey)
>   {
>   	unsigned long ret = 0;
>   
> @@ -17,6 +17,12 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
>   	if (system_supports_mte() && (prot & PROT_MTE))
>   		ret |= VM_MTE;
>   
> +#if defined(CONFIG_ARCH_HAS_PKEYS)

Should there be system_supports_poe() check like above?

Thanks,
Amit

> +	ret |= pkey & 0x1 ? VM_PKEY_BIT0 : 0;
> +	ret |= pkey & 0x2 ? VM_PKEY_BIT1 : 0;
> +	ret |= pkey & 0x4 ? VM_PKEY_BIT2 : 0;
> +#endif
> +
>   	return ret;
>   }
>   #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
> diff --git a/arch/arm64/mm/mmap.c b/arch/arm64/mm/mmap.c
> index 642bdf908b22..86eda6bc7893 100644
> --- a/arch/arm64/mm/mmap.c
> +++ b/arch/arm64/mm/mmap.c
> @@ -102,6 +102,15 @@ pgprot_t vm_get_page_prot(unsigned long vm_flags)
>   	if (vm_flags & VM_MTE)
>   		prot |= PTE_ATTRINDX(MT_NORMAL_TAGGED);
>   
> +#ifdef CONFIG_ARCH_HAS_PKEYS
> +	if (vm_flags & VM_PKEY_BIT0)
> +		prot |= PTE_PO_IDX_0;
> +	if (vm_flags & VM_PKEY_BIT1)
> +		prot |= PTE_PO_IDX_1;
> +	if (vm_flags & VM_PKEY_BIT2)
> +		prot |= PTE_PO_IDX_2;
> +#endif
> +
>   	return __pgprot(prot);
>   }
>   EXPORT_SYMBOL(vm_get_page_prot);


* Re: [PATCH v4 17/29] arm64: implement PKEYS support
  2024-05-03 13:01 ` [PATCH v4 17/29] arm64: implement PKEYS support Joey Gouly
@ 2024-05-28  6:55   ` Amit Daniel Kachhap
  2024-05-28 11:26     ` Joey Gouly
  2024-05-31 14:57   ` Szabolcs Nagy
                     ` (4 subsequent siblings)
  5 siblings, 1 reply; 146+ messages in thread
From: Amit Daniel Kachhap @ 2024-05-28  6:55 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm



On 5/3/24 18:31, Joey Gouly wrote:
> Implement the PKEYS interface, using the Permission Overlay Extension.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
>   arch/arm64/include/asm/mmu.h         |   1 +
>   arch/arm64/include/asm/mmu_context.h |  51 ++++++++++++-
>   arch/arm64/include/asm/pgtable.h     |  22 +++++-
>   arch/arm64/include/asm/pkeys.h       | 110 +++++++++++++++++++++++++++
>   arch/arm64/include/asm/por.h         |  33 ++++++++
>   arch/arm64/mm/mmu.c                  |  40 ++++++++++
>   6 files changed, 255 insertions(+), 2 deletions(-)
>   create mode 100644 arch/arm64/include/asm/pkeys.h
>   create mode 100644 arch/arm64/include/asm/por.h
> 
> diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
> index 65977c7783c5..983afeb4eba5 100644
> --- a/arch/arm64/include/asm/mmu.h
> +++ b/arch/arm64/include/asm/mmu.h
> @@ -25,6 +25,7 @@ typedef struct {
>   	refcount_t	pinned;
>   	void		*vdso;
>   	unsigned long	flags;
> +	u8		pkey_allocation_map;
>   } mm_context_t;
>   
>   /*
> diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
> index c768d16b81a4..cb499db7a97b 100644
> --- a/arch/arm64/include/asm/mmu_context.h
> +++ b/arch/arm64/include/asm/mmu_context.h
> @@ -15,12 +15,12 @@
>   #include <linux/sched/hotplug.h>
>   #include <linux/mm_types.h>
>   #include <linux/pgtable.h>
> +#include <linux/pkeys.h>
>   
>   #include <asm/cacheflush.h>
>   #include <asm/cpufeature.h>
>   #include <asm/daifflags.h>
>   #include <asm/proc-fns.h>
> -#include <asm-generic/mm_hooks.h>
>   #include <asm/cputype.h>
>   #include <asm/sysreg.h>
>   #include <asm/tlbflush.h>
> @@ -175,9 +175,36 @@ init_new_context(struct task_struct *tsk, struct mm_struct *mm)
>   {
>   	atomic64_set(&mm->context.id, 0);
>   	refcount_set(&mm->context.pinned, 0);
> +
> +	/* pkey 0 is the default, so always reserve it. */
> +	mm->context.pkey_allocation_map = 0x1;
> +
> +	return 0;
> +}
> +
> +static inline void arch_dup_pkeys(struct mm_struct *oldmm,
> +				  struct mm_struct *mm)
> +{
> +	/* Duplicate the oldmm pkey state in mm: */
> +	mm->context.pkey_allocation_map = oldmm->context.pkey_allocation_map;
> +}
> +
> +static inline int arch_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm)
> +{
> +	arch_dup_pkeys(oldmm, mm);
> +
>   	return 0;
>   }
>   
> +static inline void arch_exit_mmap(struct mm_struct *mm)
> +{
> +}
> +
> +static inline void arch_unmap(struct mm_struct *mm,
> +			unsigned long start, unsigned long end)
> +{
> +}
> +
>   #ifdef CONFIG_ARM64_SW_TTBR0_PAN
>   static inline void update_saved_ttbr0(struct task_struct *tsk,
>   				      struct mm_struct *mm)
> @@ -267,6 +294,28 @@ static inline unsigned long mm_untag_mask(struct mm_struct *mm)
>   	return -1UL >> 8;
>   }
>   
> +/*
> + * We only want to enforce protection keys on the current process
> + * because we effectively have no access to POR_EL0 for other
> + * processes or any way to tell *which * POR_EL0 in a threaded
> + * process we could use.
> + *
> + * So do not enforce things if the VMA is not from the current
> + * mm, or if we are in a kernel thread.
> + */
> +static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
> +		bool write, bool execute, bool foreign)
> +{
> +	if (!arch_pkeys_enabled())
> +		return true;

The above check can be dropped, as the caller of this function,
fault_from_pkey(), does the same check.

Thanks,
Amit

> +
> +	/* allow access if the VMA is not one from this process */
> +	if (foreign || vma_is_foreign(vma))
> +		return true;
> +
> +	return por_el0_allows_pkey(vma_pkey(vma), write, execute);
> +}
> +
>   #include <asm-generic/mmu_context.h>
>   
>   #endif /* !__ASSEMBLY__ */
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 2449e4e27ea6..8ee68ff03016 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -34,6 +34,7 @@
>   
>   #include <asm/cmpxchg.h>
>   #include <asm/fixmap.h>
> +#include <asm/por.h>
>   #include <linux/mmdebug.h>
>   #include <linux/mm_types.h>
>   #include <linux/sched.h>
> @@ -153,6 +154,24 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
>   #define pte_accessible(mm, pte)	\
>   	(mm_tlb_flush_pending(mm) ? pte_present(pte) : pte_valid(pte))
>   
> +static inline bool por_el0_allows_pkey(u8 pkey, bool write, bool execute)
> +{
> +	u64 por;
> +
> +	if (!system_supports_poe())
> +		return true;
> +
> +	por = read_sysreg_s(SYS_POR_EL0);
> +
> +	if (write)
> +		return por_elx_allows_write(por, pkey);
> +
> +	if (execute)
> +		return por_elx_allows_exec(por, pkey);
> +
> +	return por_elx_allows_read(por, pkey);
> +}
> +
>   /*
>    * p??_access_permitted() is true for valid user mappings (PTE_USER
>    * bit set, subject to the write permission check). For execute-only
> @@ -163,7 +182,8 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
>   #define pte_access_permitted_no_overlay(pte, write) \
>   	(((pte_val(pte) & (PTE_VALID | PTE_USER)) == (PTE_VALID | PTE_USER)) && (!(write) || pte_write(pte)))
>   #define pte_access_permitted(pte, write) \
> -	pte_access_permitted_no_overlay(pte, write)
> +	(pte_access_permitted_no_overlay(pte, write) && \
> +	por_el0_allows_pkey(FIELD_GET(PTE_PO_IDX_MASK, pte_val(pte)), write, false))
>   #define pmd_access_permitted(pmd, write) \
>   	(pte_access_permitted(pmd_pte(pmd), (write)))
>   #define pud_access_permitted(pud, write) \
> diff --git a/arch/arm64/include/asm/pkeys.h b/arch/arm64/include/asm/pkeys.h
> new file mode 100644
> index 000000000000..a284508a4d02
> --- /dev/null
> +++ b/arch/arm64/include/asm/pkeys.h
> @@ -0,0 +1,110 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2023 Arm Ltd.
> + *
> + * Based on arch/x86/include/asm/pkeys.h
> + */
> +
> +#ifndef _ASM_ARM64_PKEYS_H
> +#define _ASM_ARM64_PKEYS_H
> +
> +#define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2)
> +
> +#define arch_max_pkey() 7
> +
> +int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
> +		unsigned long init_val);
> +
> +static inline bool arch_pkeys_enabled(void)
> +{
> +	return false;
> +}
> +
> +static inline int vma_pkey(struct vm_area_struct *vma)
> +{
> +	return (vma->vm_flags & ARCH_VM_PKEY_FLAGS) >> VM_PKEY_SHIFT;
> +}
> +
> +static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
> +		int prot, int pkey)
> +{
> +	if (pkey != -1)
> +		return pkey;
> +
> +	return vma_pkey(vma);
> +}
> +
> +static inline int execute_only_pkey(struct mm_struct *mm)
> +{
> +	// Execute-only mappings are handled by EPAN/FEAT_PAN3.
> +	WARN_ON_ONCE(!cpus_have_final_cap(ARM64_HAS_EPAN));
> +
> +	return -1;
> +}
> +
> +#define mm_pkey_allocation_map(mm)	(mm->context.pkey_allocation_map)
> +#define mm_set_pkey_allocated(mm, pkey) do {		\
> +	mm_pkey_allocation_map(mm) |= (1U << pkey);	\
> +} while (0)
> +#define mm_set_pkey_free(mm, pkey) do {			\
> +	mm_pkey_allocation_map(mm) &= ~(1U << pkey);	\
> +} while (0)
> +
> +static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
> +{
> +	/*
> +	 * "Allocated" pkeys are those that have been returned
> +	 * from pkey_alloc() or pkey 0 which is allocated
> +	 * implicitly when the mm is created.
> +	 */
> +	if (pkey < 0)
> +		return false;
> +	if (pkey >= arch_max_pkey())
> +		return false;
> +
> +	return mm_pkey_allocation_map(mm) & (1U << pkey);
> +}
> +
> +/*
> + * Returns a positive, 3-bit key on success, or -1 on failure.
> + */
> +static inline int mm_pkey_alloc(struct mm_struct *mm)
> +{
> +	/*
> +	 * Note: this is the one and only place we make sure
> +	 * that the pkey is valid as far as the hardware is
> +	 * concerned.  The rest of the kernel trusts that
> +	 * only good, valid pkeys come out of here.
> +	 */
> +	u8 all_pkeys_mask = ((1U << arch_max_pkey()) - 1);
> +	int ret;
> +
> +	if (!arch_pkeys_enabled())
> +		return -1;
> +
> +	/*
> +	 * Are we out of pkeys?  We must handle this specially
> +	 * because ffz() behavior is undefined if there are no
> +	 * zeros.
> +	 */
> +	if (mm_pkey_allocation_map(mm) == all_pkeys_mask)
> +		return -1;
> +
> +	ret = ffz(mm_pkey_allocation_map(mm));
> +
> +	mm_set_pkey_allocated(mm, ret);
> +
> +	return ret;
> +}
> +
> +static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
> +{
> +	if (!mm_pkey_is_allocated(mm, pkey))
> +		return -EINVAL;
> +
> +	mm_set_pkey_free(mm, pkey);
> +
> +	return 0;
> +}
> +
> +#endif /* _ASM_ARM64_PKEYS_H */
> diff --git a/arch/arm64/include/asm/por.h b/arch/arm64/include/asm/por.h
> new file mode 100644
> index 000000000000..d6604e0c5c54
> --- /dev/null
> +++ b/arch/arm64/include/asm/por.h
> @@ -0,0 +1,33 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2023 Arm Ltd.
> + */
> +
> +#ifndef _ASM_ARM64_POR_H
> +#define _ASM_ARM64_POR_H
> +
> +#define POR_BITS_PER_PKEY		4
> +#define POR_ELx_IDX(por_elx, idx)	(((por_elx) >> (idx * POR_BITS_PER_PKEY)) & 0xf)
> +
> +static inline bool por_elx_allows_read(u64 por, u8 pkey)
> +{
> +	u8 perm = POR_ELx_IDX(por, pkey);
> +
> +	return perm & POE_R;
> +}
> +
> +static inline bool por_elx_allows_write(u64 por, u8 pkey)
> +{
> +	u8 perm = POR_ELx_IDX(por, pkey);
> +
> +	return perm & POE_W;
> +}
> +
> +static inline bool por_elx_allows_exec(u64 por, u8 pkey)
> +{
> +	u8 perm = POR_ELx_IDX(por, pkey);
> +
> +	return perm & POE_X;
> +}
> +
> +#endif /* _ASM_ARM64_POR_H */
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 495b732d5af3..e50ccc86d150 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -25,6 +25,7 @@
>   #include <linux/vmalloc.h>
>   #include <linux/set_memory.h>
>   #include <linux/kfence.h>
> +#include <linux/pkeys.h>
>   
>   #include <asm/barrier.h>
>   #include <asm/cputype.h>
> @@ -1535,3 +1536,42 @@ void __cpu_replace_ttbr1(pgd_t *pgdp, bool cnp)
>   
>   	cpu_uninstall_idmap();
>   }
> +
> +#ifdef CONFIG_ARCH_HAS_PKEYS
> +int arch_set_user_pkey_access(struct task_struct *tsk, int pkey, unsigned long init_val)
> +{
> +	u64 new_por = POE_RXW;
> +	u64 old_por;
> +	u64 pkey_shift;
> +
> +	if (!arch_pkeys_enabled())
> +		return -ENOSPC;
> +
> +	/*
> +	 * This code should only be called with valid 'pkey'
> +	 * values originating from in-kernel users.  Complain
> +	 * if a bad value is observed.
> +	 */
> +	if (WARN_ON_ONCE(pkey >= arch_max_pkey()))
> +		return -EINVAL;
> +
> +	/* Set the bits we need in POR:  */
> +	if (init_val & PKEY_DISABLE_ACCESS)
> +		new_por = POE_X;
> +	else if (init_val & PKEY_DISABLE_WRITE)
> +		new_por = POE_RX;
> +
> +	/* Shift the bits in to the correct place in POR for pkey: */
> +	pkey_shift = pkey * POR_BITS_PER_PKEY;
> +	new_por <<= pkey_shift;
> +
> +	/* Get old POR and mask off any old bits in place: */
> +	old_por = read_sysreg_s(SYS_POR_EL0);
> +	old_por &= ~(POE_MASK << pkey_shift);
> +
> +	/* Write old part along with new part: */
> +	write_sysreg_s(old_por | new_por, SYS_POR_EL0);
> +
> +	return 0;
> +}
> +#endif

^ permalink raw reply	[flat|nested] 146+ messages in thread
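To make the shift/mask arithmetic in POR_ELx_IDX() and arch_set_user_pkey_access() above concrete, here is a small user-space model of the per-key 4-bit field layout in POR_EL0. The POE_R/POE_W/POE_X bit values are assumptions for illustration (the real definitions come from the series' sysreg headers); the field width and masking sequence mirror the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Standalone model of the POR_EL0 layout used by patch 17: one 4-bit
 * permission field per pkey. The POE_* bit values below are assumptions
 * for illustration, not taken from the series' headers. */
#define POE_R    0x1
#define POE_W    0x2
#define POE_X    0x4
#define POE_RX   (POE_R | POE_X)
#define POE_RXW  (POE_R | POE_W | POE_X)
#define POE_MASK 0xf

#define POR_BITS_PER_PKEY		4
#define POR_ELx_IDX(por_elx, idx)	(((por_elx) >> ((idx) * POR_BITS_PER_PKEY)) & 0xf)

/* Mirrors the shift/mask sequence in arch_set_user_pkey_access(). */
static uint64_t set_pkey_perms(uint64_t por, unsigned int pkey, uint64_t perms)
{
	uint64_t shift = pkey * POR_BITS_PER_PKEY;

	por &= ~((uint64_t)POE_MASK << shift);	/* mask off the old 4-bit field */
	return por | (perms << shift);		/* install the new one */
}
```

A freshly written field can then be read back with POR_ELx_IDX(), which is how por_elx_allows_read/write/exec() extract the per-key permission nibble.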

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-05-03 13:01 ` [PATCH v4 18/29] arm64: add POE signal support Joey Gouly
@ 2024-05-28  6:56   ` Amit Daniel Kachhap
  2024-05-31 16:39     ` Mark Brown
  2024-07-05 17:04   ` Catalin Marinas
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 146+ messages in thread
From: Amit Daniel Kachhap @ 2024-05-28  6:56 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm



On 5/3/24 18:31, Joey Gouly wrote:
> Add PKEY support to signals, by saving and restoring POR_EL0 from the stackframe.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Reviewed-by: Mark Brown <broonie@kernel.org>
> Acked-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
> ---
>   arch/arm64/include/uapi/asm/sigcontext.h |  7 ++++
>   arch/arm64/kernel/signal.c               | 52 ++++++++++++++++++++++++
>   2 files changed, 59 insertions(+)
> 
> diff --git a/arch/arm64/include/uapi/asm/sigcontext.h b/arch/arm64/include/uapi/asm/sigcontext.h
> index 8a45b7a411e0..e4cba8a6c9a2 100644
> --- a/arch/arm64/include/uapi/asm/sigcontext.h
> +++ b/arch/arm64/include/uapi/asm/sigcontext.h
> @@ -98,6 +98,13 @@ struct esr_context {
>   	__u64 esr;
>   };
>   
> +#define POE_MAGIC	0x504f4530
> +
> +struct poe_context {
> +	struct _aarch64_ctx head;
> +	__u64 por_el0;
> +};

There is a comment section at the beginning of this file which mentions
the size of each context frame structure and the corresponding reduction
in the reserved range, so a description of this new context can be added
there. Although it looks like that comment is already out of date for the
za, zt and fpmr contexts.

> +
>   /*
>    * extra_context: describes extra space in the signal frame for
>    * additional structures that don't fit in sigcontext.__reserved[].
> diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> index 4a77f4976e11..077436a8bc10 100644
> --- a/arch/arm64/kernel/signal.c
> +++ b/arch/arm64/kernel/signal.c
> @@ -63,6 +63,7 @@ struct rt_sigframe_user_layout {
>   	unsigned long fpmr_offset;
>   	unsigned long extra_offset;
>   	unsigned long end_offset;
> +	unsigned long poe_offset;

For consistency, this can be added after fpmr_offset.

Thanks,
Amit

>   };
>   
>   #define BASE_SIGFRAME_SIZE round_up(sizeof(struct rt_sigframe), 16)
> @@ -185,6 +186,8 @@ struct user_ctxs {
>   	u32 zt_size;
>   	struct fpmr_context __user *fpmr;
>   	u32 fpmr_size;
> +	struct poe_context __user *poe;

^ permalink raw reply	[flat|nested] 146+ messages in thread
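The poe_context record added above follows the usual arm64 signal-frame convention: a list of records, each starting with a magic/size header, terminated by a zero magic. The sketch below models walking that list for POE_MAGIC in user space; the struct names are local stand-ins for the uapi `struct _aarch64_ctx` and `struct poe_context`, and the walk itself is an illustration, not the kernel's parse_user_sigframe().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POE_MAGIC 0x504f4530	/* ASCII "POE0", from the patch */

/* Local stand-ins for the uapi structures. */
struct aarch64_ctx {
	uint32_t magic;
	uint32_t size;
};

struct poe_context {
	struct aarch64_ctx head;
	uint64_t por_el0;
};

/* Returns the POR_EL0 value recorded in the frame, or 0 if absent. */
static uint64_t find_por_el0(const uint8_t *frame, size_t len)
{
	size_t off = 0;

	while (off + sizeof(struct aarch64_ctx) <= len) {
		struct aarch64_ctx head;

		memcpy(&head, frame + off, sizeof(head));
		if (head.magic == 0 || head.size < sizeof(head))
			break;		/* terminator record, or malformed */
		if (head.magic == POE_MAGIC &&
		    head.size >= sizeof(struct poe_context)) {
			struct poe_context poe;

			memcpy(&poe, frame + off, sizeof(poe));
			return poe.por_el0;
		}
		off += head.size;	/* skip to the next record */
	}
	return 0;
}
```

This is essentially what the new poe_siginfo selftest in this series has to do when it looks for POE_MAGIC in a delivered signal frame.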

* Re: [PATCH v4 17/29] arm64: implement PKEYS support
  2024-05-28  6:55   ` Amit Daniel Kachhap
@ 2024-05-28 11:26     ` Joey Gouly
  0 siblings, 0 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-28 11:26 UTC (permalink / raw)
  To: Amit Daniel Kachhap
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

Hi Amit,

Thanks for taking a look!

On Tue, May 28, 2024 at 12:25:58PM +0530, Amit Daniel Kachhap wrote:
> 
> 
> On 5/3/24 18:31, Joey Gouly wrote:
> > Implement the PKEYS interface, using the Permission Overlay Extension.
> > 
> > Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > ---
> >   arch/arm64/include/asm/mmu.h         |   1 +
> >   arch/arm64/include/asm/mmu_context.h |  51 ++++++++++++-
> >   arch/arm64/include/asm/pgtable.h     |  22 +++++-
> >   arch/arm64/include/asm/pkeys.h       | 110 +++++++++++++++++++++++++++
> >   arch/arm64/include/asm/por.h         |  33 ++++++++
> >   arch/arm64/mm/mmu.c                  |  40 ++++++++++
> >   6 files changed, 255 insertions(+), 2 deletions(-)
> >   create mode 100644 arch/arm64/include/asm/pkeys.h
> >   create mode 100644 arch/arm64/include/asm/por.h
> > 
> > diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
> > index 65977c7783c5..983afeb4eba5 100644
> > --- a/arch/arm64/include/asm/mmu.h
> > +++ b/arch/arm64/include/asm/mmu.h
> > @@ -25,6 +25,7 @@ typedef struct {
> >   	refcount_t	pinned;
> >   	void		*vdso;
> >   	unsigned long	flags;
> > +	u8		pkey_allocation_map;
> >   } mm_context_t;
> >   /*
> > diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
> > index c768d16b81a4..cb499db7a97b 100644
> > --- a/arch/arm64/include/asm/mmu_context.h
> > +++ b/arch/arm64/include/asm/mmu_context.h
> > @@ -15,12 +15,12 @@
> >   #include <linux/sched/hotplug.h>
> >   #include <linux/mm_types.h>
> >   #include <linux/pgtable.h>
> > +#include <linux/pkeys.h>
> >   #include <asm/cacheflush.h>
> >   #include <asm/cpufeature.h>
> >   #include <asm/daifflags.h>
> >   #include <asm/proc-fns.h>
> > -#include <asm-generic/mm_hooks.h>
> >   #include <asm/cputype.h>
> >   #include <asm/sysreg.h>
> >   #include <asm/tlbflush.h>
> > @@ -175,9 +175,36 @@ init_new_context(struct task_struct *tsk, struct mm_struct *mm)
> >   {
> >   	atomic64_set(&mm->context.id, 0);
> >   	refcount_set(&mm->context.pinned, 0);
> > +
> > +	/* pkey 0 is the default, so always reserve it. */
> > +	mm->context.pkey_allocation_map = 0x1;
> > +
> > +	return 0;
> > +}
> > +
> > +static inline void arch_dup_pkeys(struct mm_struct *oldmm,
> > +				  struct mm_struct *mm)
> > +{
> > +	/* Duplicate the oldmm pkey state in mm: */
> > +	mm->context.pkey_allocation_map = oldmm->context.pkey_allocation_map;
> > +}
> > +
> > +static inline int arch_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm)
> > +{
> > +	arch_dup_pkeys(oldmm, mm);
> > +
> >   	return 0;
> >   }
> > +static inline void arch_exit_mmap(struct mm_struct *mm)
> > +{
> > +}
> > +
> > +static inline void arch_unmap(struct mm_struct *mm,
> > +			unsigned long start, unsigned long end)
> > +{
> > +}
> > +
> >   #ifdef CONFIG_ARM64_SW_TTBR0_PAN
> >   static inline void update_saved_ttbr0(struct task_struct *tsk,
> >   				      struct mm_struct *mm)
> > @@ -267,6 +294,28 @@ static inline unsigned long mm_untag_mask(struct mm_struct *mm)
> >   	return -1UL >> 8;
> >   }
> > +/*
> > + * We only want to enforce protection keys on the current process
> > + * because we effectively have no access to POR_EL0 for other
> > + * processes or any way to tell *which * POR_EL0 in a threaded
> > + * process we could use.
> > + *
> > + * So do not enforce things if the VMA is not from the current
> > + * mm, or if we are in a kernel thread.
> > + */
> > +static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
> > +		bool write, bool execute, bool foreign)
> > +{
> > +	if (!arch_pkeys_enabled())
> > +		return true;
> 
> The above check can be dropped as the caller of this function
> fault_from_pkey() does the same check.

arch_vma_access_permitted() is called from other places in the kernel, so I need to leave that check in.

Thanks,
Joey

> 
> Thanks,
> Amit
> 
> > +
> > +	/* allow access if the VMA is not one from this process */
> > +	if (foreign || vma_is_foreign(vma))
> > +		return true;
> > +
> > +	return por_el0_allows_pkey(vma_pkey(vma), write, execute);
> > +}
> > +
> >   #include <asm-generic/mmu_context.h>
> >   #endif /* !__ASSEMBLY__ */
> > diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> > index 2449e4e27ea6..8ee68ff03016 100644
> > --- a/arch/arm64/include/asm/pgtable.h
> > +++ b/arch/arm64/include/asm/pgtable.h
> > @@ -34,6 +34,7 @@
> >   #include <asm/cmpxchg.h>
> >   #include <asm/fixmap.h>
> > +#include <asm/por.h>
> >   #include <linux/mmdebug.h>
> >   #include <linux/mm_types.h>
> >   #include <linux/sched.h>
> > @@ -153,6 +154,24 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
> >   #define pte_accessible(mm, pte)	\
> >   	(mm_tlb_flush_pending(mm) ? pte_present(pte) : pte_valid(pte))
> > +static inline bool por_el0_allows_pkey(u8 pkey, bool write, bool execute)
> > +{
> > +	u64 por;
> > +
> > +	if (!system_supports_poe())
> > +		return true;
> > +
> > +	por = read_sysreg_s(SYS_POR_EL0);
> > +
> > +	if (write)
> > +		return por_elx_allows_write(por, pkey);
> > +
> > +	if (execute)
> > +		return por_elx_allows_exec(por, pkey);
> > +
> > +	return por_elx_allows_read(por, pkey);
> > +}
> > +
> >   /*
> >    * p??_access_permitted() is true for valid user mappings (PTE_USER
> >    * bit set, subject to the write permission check). For execute-only
> > @@ -163,7 +182,8 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
> >   #define pte_access_permitted_no_overlay(pte, write) \
> >   	(((pte_val(pte) & (PTE_VALID | PTE_USER)) == (PTE_VALID | PTE_USER)) && (!(write) || pte_write(pte)))
> >   #define pte_access_permitted(pte, write) \
> > -	pte_access_permitted_no_overlay(pte, write)
> > +	(pte_access_permitted_no_overlay(pte, write) && \
> > +	por_el0_allows_pkey(FIELD_GET(PTE_PO_IDX_MASK, pte_val(pte)), write, false))
> >   #define pmd_access_permitted(pmd, write) \
> >   	(pte_access_permitted(pmd_pte(pmd), (write)))
> >   #define pud_access_permitted(pud, write) \
> > diff --git a/arch/arm64/include/asm/pkeys.h b/arch/arm64/include/asm/pkeys.h
> > new file mode 100644
> > index 000000000000..a284508a4d02
> > --- /dev/null
> > +++ b/arch/arm64/include/asm/pkeys.h
> > @@ -0,0 +1,110 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2023 Arm Ltd.
> > + *
> > + * Based on arch/x86/include/asm/pkeys.h
> > + */
> > +
> > +#ifndef _ASM_ARM64_PKEYS_H
> > +#define _ASM_ARM64_PKEYS_H
> > +
> > +#define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2)
> > +
> > +#define arch_max_pkey() 7
> > +
> > +int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
> > +		unsigned long init_val);
> > +
> > +static inline bool arch_pkeys_enabled(void)
> > +{
> > +	return false;
> > +}
> > +
> > +static inline int vma_pkey(struct vm_area_struct *vma)
> > +{
> > +	return (vma->vm_flags & ARCH_VM_PKEY_FLAGS) >> VM_PKEY_SHIFT;
> > +}
> > +
> > +static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
> > +		int prot, int pkey)
> > +{
> > +	if (pkey != -1)
> > +		return pkey;
> > +
> > +	return vma_pkey(vma);
> > +}
> > +
> > +static inline int execute_only_pkey(struct mm_struct *mm)
> > +{
> > +	// Execute-only mappings are handled by EPAN/FEAT_PAN3.
> > +	WARN_ON_ONCE(!cpus_have_final_cap(ARM64_HAS_EPAN));
> > +
> > +	return -1;
> > +}
> > +
> > +#define mm_pkey_allocation_map(mm)	(mm->context.pkey_allocation_map)
> > +#define mm_set_pkey_allocated(mm, pkey) do {		\
> > +	mm_pkey_allocation_map(mm) |= (1U << pkey);	\
> > +} while (0)
> > +#define mm_set_pkey_free(mm, pkey) do {			\
> > +	mm_pkey_allocation_map(mm) &= ~(1U << pkey);	\
> > +} while (0)
> > +
> > +static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
> > +{
> > +	/*
> > +	 * "Allocated" pkeys are those that have been returned
> > +	 * from pkey_alloc() or pkey 0 which is allocated
> > +	 * implicitly when the mm is created.
> > +	 */
> > +	if (pkey < 0)
> > +		return false;
> > +	if (pkey >= arch_max_pkey())
> > +		return false;
> > +
> > +	return mm_pkey_allocation_map(mm) & (1U << pkey);
> > +}
> > +
> > +/*
> > + * Returns a positive, 3-bit key on success, or -1 on failure.
> > + */
> > +static inline int mm_pkey_alloc(struct mm_struct *mm)
> > +{
> > +	/*
> > +	 * Note: this is the one and only place we make sure
> > +	 * that the pkey is valid as far as the hardware is
> > +	 * concerned.  The rest of the kernel trusts that
> > +	 * only good, valid pkeys come out of here.
> > +	 */
> > +	u8 all_pkeys_mask = ((1U << arch_max_pkey()) - 1);
> > +	int ret;
> > +
> > +	if (!arch_pkeys_enabled())
> > +		return -1;
> > +
> > +	/*
> > +	 * Are we out of pkeys?  We must handle this specially
> > +	 * because ffz() behavior is undefined if there are no
> > +	 * zeros.
> > +	 */
> > +	if (mm_pkey_allocation_map(mm) == all_pkeys_mask)
> > +		return -1;
> > +
> > +	ret = ffz(mm_pkey_allocation_map(mm));
> > +
> > +	mm_set_pkey_allocated(mm, ret);
> > +
> > +	return ret;
> > +}
> > +
> > +static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
> > +{
> > +	if (!mm_pkey_is_allocated(mm, pkey))
> > +		return -EINVAL;
> > +
> > +	mm_set_pkey_free(mm, pkey);
> > +
> > +	return 0;
> > +}
> > +
> > +#endif /* _ASM_ARM64_PKEYS_H */
> > diff --git a/arch/arm64/include/asm/por.h b/arch/arm64/include/asm/por.h
> > new file mode 100644
> > index 000000000000..d6604e0c5c54
> > --- /dev/null
> > +++ b/arch/arm64/include/asm/por.h
> > @@ -0,0 +1,33 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (C) 2023 Arm Ltd.
> > + */
> > +
> > +#ifndef _ASM_ARM64_POR_H
> > +#define _ASM_ARM64_POR_H
> > +
> > +#define POR_BITS_PER_PKEY		4
> > +#define POR_ELx_IDX(por_elx, idx)	(((por_elx) >> (idx * POR_BITS_PER_PKEY)) & 0xf)
> > +
> > +static inline bool por_elx_allows_read(u64 por, u8 pkey)
> > +{
> > +	u8 perm = POR_ELx_IDX(por, pkey);
> > +
> > +	return perm & POE_R;
> > +}
> > +
> > +static inline bool por_elx_allows_write(u64 por, u8 pkey)
> > +{
> > +	u8 perm = POR_ELx_IDX(por, pkey);
> > +
> > +	return perm & POE_W;
> > +}
> > +
> > +static inline bool por_elx_allows_exec(u64 por, u8 pkey)
> > +{
> > +	u8 perm = POR_ELx_IDX(por, pkey);
> > +
> > +	return perm & POE_X;
> > +}
> > +
> > +#endif /* _ASM_ARM64_POR_H */
> > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> > index 495b732d5af3..e50ccc86d150 100644
> > --- a/arch/arm64/mm/mmu.c
> > +++ b/arch/arm64/mm/mmu.c
> > @@ -25,6 +25,7 @@
> >   #include <linux/vmalloc.h>
> >   #include <linux/set_memory.h>
> >   #include <linux/kfence.h>
> > +#include <linux/pkeys.h>
> >   #include <asm/barrier.h>
> >   #include <asm/cputype.h>
> > @@ -1535,3 +1536,42 @@ void __cpu_replace_ttbr1(pgd_t *pgdp, bool cnp)
> >   	cpu_uninstall_idmap();
> >   }
> > +
> > +#ifdef CONFIG_ARCH_HAS_PKEYS
> > +int arch_set_user_pkey_access(struct task_struct *tsk, int pkey, unsigned long init_val)
> > +{
> > +	u64 new_por = POE_RXW;
> > +	u64 old_por;
> > +	u64 pkey_shift;
> > +
> > +	if (!arch_pkeys_enabled())
> > +		return -ENOSPC;
> > +
> > +	/*
> > +	 * This code should only be called with valid 'pkey'
> > +	 * values originating from in-kernel users.  Complain
> > +	 * if a bad value is observed.
> > +	 */
> > +	if (WARN_ON_ONCE(pkey >= arch_max_pkey()))
> > +		return -EINVAL;
> > +
> > +	/* Set the bits we need in POR:  */
> > +	if (init_val & PKEY_DISABLE_ACCESS)
> > +		new_por = POE_X;
> > +	else if (init_val & PKEY_DISABLE_WRITE)
> > +		new_por = POE_RX;
> > +
> > +	/* Shift the bits in to the correct place in POR for pkey: */
> > +	pkey_shift = pkey * POR_BITS_PER_PKEY;
> > +	new_por <<= pkey_shift;
> > +
> > +	/* Get old POR and mask off any old bits in place: */
> > +	old_por = read_sysreg_s(SYS_POR_EL0);
> > +	old_por &= ~(POE_MASK << pkey_shift);
> > +
> > +	/* Write old part along with new part: */
> > +	write_sysreg_s(old_por | new_por, SYS_POR_EL0);
> > +
> > +	return 0;
> > +}
> > +#endif

^ permalink raw reply	[flat|nested] 146+ messages in thread
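The allocation-map logic in the patch under discussion (pkey 0 reserved at init_new_context(), ffz() to find a free slot, and the full-map guard because ffz() is undefined with no zero bits) can be modelled outside the kernel as follows. ffz() is open-coded here; in the kernel it comes from the bitops headers.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PKEYS 7	/* arch_max_pkey() in the patch */

/* Open-coded ffz(): index of the first zero bit, -1 if none. */
static int ffz8(uint8_t v)
{
	for (int i = 0; i < 8; i++)
		if (!(v & (1U << i)))
			return i;
	return -1;
}

/* Models mm_pkey_alloc() against a pkey_allocation_map byte. */
static int pkey_alloc_sim(uint8_t *map)
{
	uint8_t all_pkeys_mask = (1U << MAX_PKEYS) - 1;
	int pkey;

	if (*map == all_pkeys_mask)	/* out of pkeys: never run ffz() on all-ones */
		return -1;

	pkey = ffz8(*map);
	*map |= 1U << pkey;		/* mm_set_pkey_allocated() */
	return pkey;
}
```

Freeing a key (mm_pkey_free()) is just clearing the bit again, after checking it was allocated in the first place.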

* Re: [PATCH v4 00/29] arm64: Permission Overlay Extension
  2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
                   ` (29 preceding siblings ...)
  2024-05-05 14:41 ` [PATCH v4 00/29] arm64: Permission Overlay Extension Mark Brown
@ 2024-05-28 11:30 ` Joey Gouly
  30 siblings, 0 replies; 146+ messages in thread
From: Joey Gouly @ 2024-05-28 11:30 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On Fri, May 03, 2024 at 02:01:18PM +0100, Joey Gouly wrote:
> Hi all,
> 
> This series implements the Permission Overlay Extension introduced in 2022
> VMSA enhancements [1]. It is based on v6.9-rc5.
> 
> One possible issue with this version, I took the last bit of HWCAP2.
> 
> Changes since v3[2]:
> 	- Moved Kconfig to nearer the end of the series
> 	- Reworked MMU Fault path, to check for POE faults earlier, under the mm lock
> 	- Rework VM_FLAGS to use Kconfig option
> 	- Don't check POR_EL0 in MTE sync tags function
> 	- Reworked KVM to fit into VNCR/VM configuration changes
> 	- Use new AT instruction in KVM
> 	- Rebase onto v6.9-rc5
> 
> The Permission Overlay Extension makes it possible to constrain permissions on
> memory regions. This can be done from userspace (EL0) without a system call or
> TLB invalidation.
> 
> POE is used to implement the Memory Protection Keys [3] Linux syscall.
> 
> The first few patches add the basic framework, then the PKEYS interface is
> implemented, and then the selftests are made to work on arm64.
> 
> I have tested the modified protection_keys test on x86_64, but not PPC.
> I haven't build tested the x86/ppc arch changes.
> 
> Thanks,
> Joey

I found a silly off-by-one error, so I will be sending a v5 at some point.

> 
> Joey Gouly (29):
>   powerpc/mm: add ARCH_PKEY_BITS to Kconfig
>   x86/mm: add ARCH_PKEY_BITS to Kconfig
>   mm: use ARCH_PKEY_BITS to define VM_PKEY_BITN
>   arm64: disable trapping of POR_EL0 to EL2
>   arm64: cpufeature: add Permission Overlay Extension cpucap
>   arm64: context switch POR_EL0 register
>   KVM: arm64: Save/restore POE registers
>   KVM: arm64: make kvm_at() take an OP_AT_*
>   KVM: arm64: use `at s1e1a` for POE
>   arm64: enable the Permission Overlay Extension for EL0
>   arm64: re-order MTE VM_ flags
>   arm64: add POIndex defines
>   arm64: convert protection key into vm_flags and pgprot values
>   arm64: mask out POIndex when modifying a PTE
>   arm64: handle PKEY/POE faults
>   arm64: add pte_access_permitted_no_overlay()
>   arm64: implement PKEYS support
>   arm64: add POE signal support
>   arm64: enable PKEY support for CPUs with S1POE
>   arm64: enable POE and PIE to coexist
>   arm64/ptrace: add support for FEAT_POE
>   arm64: add Permission Overlay Extension Kconfig
>   kselftest/arm64: move get_header()
>   selftests: mm: move fpregs printing
>   selftests: mm: make protection_keys test work on arm64
>   kselftest/arm64: add HWCAP test for FEAT_S1POE
>   kselftest/arm64: parse POE_MAGIC in a signal frame
>   kselftest/arm64: Add test case for POR_EL0 signal frame records
>   KVM: selftests: get-reg-list: add Permission Overlay registers
> 
>  Documentation/arch/arm64/elf_hwcaps.rst       |   2 +
>  arch/arm64/Kconfig                            |  22 +++
>  arch/arm64/include/asm/cpufeature.h           |   6 +
>  arch/arm64/include/asm/el2_setup.h            |  10 +-
>  arch/arm64/include/asm/hwcap.h                |   1 +
>  arch/arm64/include/asm/kvm_asm.h              |   3 +-
>  arch/arm64/include/asm/kvm_host.h             |   4 +
>  arch/arm64/include/asm/mman.h                 |   8 +-
>  arch/arm64/include/asm/mmu.h                  |   1 +
>  arch/arm64/include/asm/mmu_context.h          |  51 ++++++-
>  arch/arm64/include/asm/pgtable-hwdef.h        |  10 ++
>  arch/arm64/include/asm/pgtable-prot.h         |   8 +-
>  arch/arm64/include/asm/pgtable.h              |  34 ++++-
>  arch/arm64/include/asm/pkeys.h                | 110 ++++++++++++++
>  arch/arm64/include/asm/por.h                  |  33 +++++
>  arch/arm64/include/asm/processor.h            |   1 +
>  arch/arm64/include/asm/sysreg.h               |   3 +
>  arch/arm64/include/asm/traps.h                |   1 +
>  arch/arm64/include/asm/vncr_mapping.h         |   1 +
>  arch/arm64/include/uapi/asm/hwcap.h           |   1 +
>  arch/arm64/include/uapi/asm/sigcontext.h      |   7 +
>  arch/arm64/kernel/cpufeature.c                |  23 +++
>  arch/arm64/kernel/cpuinfo.c                   |   1 +
>  arch/arm64/kernel/process.c                   |  28 ++++
>  arch/arm64/kernel/ptrace.c                    |  46 ++++++
>  arch/arm64/kernel/signal.c                    |  52 +++++++
>  arch/arm64/kernel/traps.c                     |  12 +-
>  arch/arm64/kvm/hyp/include/hyp/fault.h        |   5 +-
>  arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h    |  29 ++++
>  arch/arm64/kvm/sys_regs.c                     |   8 +-
>  arch/arm64/mm/fault.c                         |  56 ++++++-
>  arch/arm64/mm/mmap.c                          |   9 ++
>  arch/arm64/mm/mmu.c                           |  40 +++++
>  arch/arm64/tools/cpucaps                      |   1 +
>  arch/powerpc/Kconfig                          |   4 +
>  arch/x86/Kconfig                              |   4 +
>  fs/proc/task_mmu.c                            |   2 +
>  include/linux/mm.h                            |  20 ++-
>  include/uapi/linux/elf.h                      |   1 +
>  tools/testing/selftests/arm64/abi/hwcap.c     |  14 ++
>  .../testing/selftests/arm64/signal/.gitignore |   1 +
>  .../arm64/signal/testcases/poe_siginfo.c      |  86 +++++++++++
>  .../arm64/signal/testcases/testcases.c        |  27 +---
>  .../arm64/signal/testcases/testcases.h        |  28 +++-
>  .../selftests/kvm/aarch64/get-reg-list.c      |  14 ++
>  tools/testing/selftests/mm/Makefile           |   2 +-
>  tools/testing/selftests/mm/pkey-arm64.h       | 139 ++++++++++++++++++
>  tools/testing/selftests/mm/pkey-helpers.h     |   8 +
>  tools/testing/selftests/mm/pkey-powerpc.h     |   3 +
>  tools/testing/selftests/mm/pkey-x86.h         |   4 +
>  tools/testing/selftests/mm/protection_keys.c  | 109 ++++++++++++--
>  51 files changed, 1027 insertions(+), 66 deletions(-)
>  create mode 100644 arch/arm64/include/asm/pkeys.h
>  create mode 100644 arch/arm64/include/asm/por.h
>  create mode 100644 tools/testing/selftests/arm64/signal/testcases/poe_siginfo.c
>  create mode 100644 tools/testing/selftests/mm/pkey-arm64.h
> 
> -- 
> 2.25.1
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 146+ messages in thread
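For readers coming from the pkeys syscall side: the series maps the generic PKEY_DISABLE_* request bits onto a POE permission nibble in arch_set_user_pkey_access() (patch 17). The PKEY_DISABLE_ACCESS/PKEY_DISABLE_WRITE values below are the generic uapi ones; the POE_* values are assumptions for illustration, consistent with R=1, W=2, X=4.

```c
#include <assert.h>
#include <stdint.h>

/* Generic uapi request bits for pkey_alloc()/pkey_set(). */
#define PKEY_DISABLE_ACCESS	0x1
#define PKEY_DISABLE_WRITE	0x2

/* Assumed POE permission encodings (R=1, W=2, X=4). */
#define POE_X	0x4
#define POE_RX	0x5
#define POE_RXW	0x7

/* Mirrors the init_val handling in arch_set_user_pkey_access():
 * DISABLE_ACCESS removes read and write but, notably, keeps execute,
 * matching the established pkeys semantics on other architectures. */
static uint64_t init_val_to_perms(unsigned long init_val)
{
	if (init_val & PKEY_DISABLE_ACCESS)
		return POE_X;
	if (init_val & PKEY_DISABLE_WRITE)
		return POE_RX;
	return POE_RXW;
}
```

Note that DISABLE_ACCESS takes priority when both bits are set, so `PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE` also yields the execute-only encoding.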

* Re: [PATCH v4 07/29] KVM: arm64: Save/restore POE registers
  2024-05-03 13:01 ` [PATCH v4 07/29] KVM: arm64: Save/restore POE registers Joey Gouly
@ 2024-05-29 15:43   ` Marc Zyngier
  2024-08-16 14:55   ` Marc Zyngier
  1 sibling, 0 replies; 146+ messages in thread
From: Marc Zyngier @ 2024-05-29 15:43 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, mingo, mpe, naveen.n.rao,
	npiggin, oliver.upton, shuah, szabolcs.nagy, tglx, will, x86,
	kvmarm

On Fri, 03 May 2024 14:01:25 +0100,
Joey Gouly <joey.gouly@arm.com> wrote:
> 
> Define the new system registers that POE introduces and context switch them.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Oliver Upton <oliver.upton@linux.dev>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/kvm_host.h          |  4 +++
>  arch/arm64/include/asm/vncr_mapping.h      |  1 +
>  arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 29 ++++++++++++++++++++++
>  arch/arm64/kvm/sys_regs.c                  |  8 ++++--
>  4 files changed, 40 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 9e8a496fb284..28042da0befd 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -419,6 +419,8 @@ enum vcpu_sysreg {
>  	GCR_EL1,	/* Tag Control Register */
>  	TFSRE0_EL1,	/* Tag Fault Status Register (EL0) */
>  
> +	POR_EL0,	/* Permission Overlay Register 0 (EL0) */
> +
>  	/* 32bit specific registers. */
>  	DACR32_EL2,	/* Domain Access Control Register */
>  	IFSR32_EL2,	/* Instruction Fault Status Register */
> @@ -489,6 +491,8 @@ enum vcpu_sysreg {
>  	VNCR(PIR_EL1),	 /* Permission Indirection Register 1 (EL1) */
>  	VNCR(PIRE0_EL1), /*  Permission Indirection Register 0 (EL1) */
>  
> +	VNCR(POR_EL1),	/* Permission Overlay Register 1 (EL1) */
> +
>  	VNCR(HFGRTR_EL2),
>  	VNCR(HFGWTR_EL2),
>  	VNCR(HFGITR_EL2),
> diff --git a/arch/arm64/include/asm/vncr_mapping.h b/arch/arm64/include/asm/vncr_mapping.h
> index df2c47c55972..06f8ec0906a6 100644
> --- a/arch/arm64/include/asm/vncr_mapping.h
> +++ b/arch/arm64/include/asm/vncr_mapping.h
> @@ -52,6 +52,7 @@
>  #define VNCR_PIRE0_EL1		0x290
>  #define VNCR_PIRE0_EL2		0x298
>  #define VNCR_PIR_EL1		0x2A0
> +#define VNCR_POR_EL1		0x2A8
>  #define VNCR_ICH_LR0_EL2        0x400
>  #define VNCR_ICH_LR1_EL2        0x408
>  #define VNCR_ICH_LR2_EL2        0x410
> diff --git a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
> index 4be6a7fa0070..1c9536557bae 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
> @@ -16,9 +16,15 @@
>  #include <asm/kvm_hyp.h>
>  #include <asm/kvm_mmu.h>
>  
> +static inline bool ctxt_has_s1poe(struct kvm_cpu_context *ctxt);
> +
>  static inline void __sysreg_save_common_state(struct kvm_cpu_context *ctxt)
>  {
>  	ctxt_sys_reg(ctxt, MDSCR_EL1)	= read_sysreg(mdscr_el1);
> +
> +	// POR_EL0 can affect uaccess, so must be saved/restored early.
> +	if (ctxt_has_s1poe(ctxt))
> +		ctxt_sys_reg(ctxt, POR_EL0)	= read_sysreg_s(SYS_POR_EL0);
>  }
>  
>  static inline void __sysreg_save_user_state(struct kvm_cpu_context *ctxt)
> @@ -55,6 +61,17 @@ static inline bool ctxt_has_s1pie(struct kvm_cpu_context *ctxt)
>  	return kvm_has_feat(kern_hyp_va(vcpu->kvm), ID_AA64MMFR3_EL1, S1PIE, IMP);
>  }
>  
> +static inline bool ctxt_has_s1poe(struct kvm_cpu_context *ctxt)
> +{
> +	struct kvm_vcpu *vcpu;
> +
> +	if (!system_supports_poe())
> +		return false;
> +
> +	vcpu = ctxt_to_vcpu(ctxt);
> +	return kvm_has_feat(kern_hyp_va(vcpu->kvm), ID_AA64MMFR3_EL1, S1POE, IMP);
> +}
> +
>  static inline void __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
>  {
>  	ctxt_sys_reg(ctxt, SCTLR_EL1)	= read_sysreg_el1(SYS_SCTLR);
> @@ -77,6 +94,10 @@ static inline void __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
>  		ctxt_sys_reg(ctxt, PIR_EL1)	= read_sysreg_el1(SYS_PIR);
>  		ctxt_sys_reg(ctxt, PIRE0_EL1)	= read_sysreg_el1(SYS_PIRE0);
>  	}
> +
> +	if (ctxt_has_s1poe(ctxt))
> +		ctxt_sys_reg(ctxt, POR_EL1)	= read_sysreg_el1(SYS_POR);
> +

Since you are hacking around here, could you please make the
save/restore of TCR2_EL1 conditional on FEAT_TCR2 being advertised
instead of just checking what's on the host?

Given that this feature is implied by both S1PIE and S1POE, you'd just
have to have some local flag. Doesn't have to be part of this patch
either.

>  	ctxt_sys_reg(ctxt, PAR_EL1)	= read_sysreg_par();
>  	ctxt_sys_reg(ctxt, TPIDR_EL1)	= read_sysreg(tpidr_el1);
>  
> @@ -107,6 +128,10 @@ static inline void __sysreg_save_el2_return_state(struct kvm_cpu_context *ctxt)
>  static inline void __sysreg_restore_common_state(struct kvm_cpu_context *ctxt)
>  {
>  	write_sysreg(ctxt_sys_reg(ctxt, MDSCR_EL1),  mdscr_el1);
> +
> +	// POR_EL0 can affect uaccess, so must be saved/restored early.
> +	if (ctxt_has_s1poe(ctxt))
> +		write_sysreg_s(ctxt_sys_reg(ctxt, POR_EL0),	SYS_POR_EL0);
>  }
>  
>  static inline void __sysreg_restore_user_state(struct kvm_cpu_context *ctxt)
> @@ -153,6 +178,10 @@ static inline void __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt)
>  		write_sysreg_el1(ctxt_sys_reg(ctxt, PIR_EL1),	SYS_PIR);
>  		write_sysreg_el1(ctxt_sys_reg(ctxt, PIRE0_EL1),	SYS_PIRE0);
>  	}
> +
> +	if (ctxt_has_s1poe(ctxt))
> +		write_sysreg_el1(ctxt_sys_reg(ctxt, POR_EL1),	SYS_POR);
> +
>  	write_sysreg(ctxt_sys_reg(ctxt, PAR_EL1),	par_el1);
>  	write_sysreg(ctxt_sys_reg(ctxt, TPIDR_EL1),	tpidr_el1);
>  
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index c9f4f387155f..be04fae35afb 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -2423,6 +2423,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>  	{ SYS_DESC(SYS_MAIR_EL1), access_vm_reg, reset_unknown, MAIR_EL1 },
>  	{ SYS_DESC(SYS_PIRE0_EL1), NULL, reset_unknown, PIRE0_EL1 },
>  	{ SYS_DESC(SYS_PIR_EL1), NULL, reset_unknown, PIR_EL1 },
> +	{ SYS_DESC(SYS_POR_EL1), NULL, reset_unknown, POR_EL1 },
>  	{ SYS_DESC(SYS_AMAIR_EL1), access_vm_reg, reset_amair_el1, AMAIR_EL1 },
>  
>  	{ SYS_DESC(SYS_LORSA_EL1), trap_loregion },
> @@ -2506,6 +2507,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>  	  .access = access_pmovs, .reg = PMOVSSET_EL0,
>  	  .get_user = get_pmreg, .set_user = set_pmreg },
>  
> +	{ SYS_DESC(SYS_POR_EL0), NULL, reset_unknown, POR_EL0 },
>  	{ SYS_DESC(SYS_TPIDR_EL0), NULL, reset_unknown, TPIDR_EL0 },
>  	{ SYS_DESC(SYS_TPIDRRO_EL0), NULL, reset_unknown, TPIDRRO_EL0 },
>  	{ SYS_DESC(SYS_TPIDR2_EL0), undef_access },
> @@ -4057,8 +4059,6 @@ void kvm_init_sysreg(struct kvm_vcpu *vcpu)
>  	kvm->arch.fgu[HFGxTR_GROUP] = (HFGxTR_EL2_nAMAIR2_EL1		|
>  				       HFGxTR_EL2_nMAIR2_EL1		|
>  				       HFGxTR_EL2_nS2POR_EL1		|
> -				       HFGxTR_EL2_nPOR_EL1		|
> -				       HFGxTR_EL2_nPOR_EL0		|
>  				       HFGxTR_EL2_nACCDATA_EL1		|
>  				       HFGxTR_EL2_nSMPRI_EL1_MASK	|
>  				       HFGxTR_EL2_nTPIDR2_EL0_MASK);
> @@ -4093,6 +4093,10 @@ void kvm_init_sysreg(struct kvm_vcpu *vcpu)
>  		kvm->arch.fgu[HFGxTR_GROUP] |= (HFGxTR_EL2_nPIRE0_EL1 |
>  						HFGxTR_EL2_nPIR_EL1);
>  
> +	if (!kvm_has_feat(kvm, ID_AA64MMFR3_EL1, S1POE, IMP))
> +		kvm->arch.fgu[HFGxTR_GROUP] |= (HFGxTR_EL2_nPOR_EL1 |
> +						HFGxTR_EL2_nPOR_EL0);
> +
>  	if (!kvm_has_feat(kvm, ID_AA64PFR0_EL1, AMU, IMP))
>  		kvm->arch.fgu[HAFGRTR_GROUP] |= ~(HAFGRTR_EL2_RES0 |
>  						  HAFGRTR_EL2_RES1);

Otherwise, looks good.

Reviewed-by: Marc Zyngier <maz@kernel.org>

	M.

-- 
Without deviation from the norm, progress is not possible.

* Re: [PATCH v4 08/29] KVM: arm64: make kvm_at() take an OP_AT_*
  2024-05-03 13:01 ` [PATCH v4 08/29] KVM: arm64: make kvm_at() take an OP_AT_* Joey Gouly
@ 2024-05-29 15:46   ` Marc Zyngier
  2024-07-15  8:36   ` Anshuman Khandual
  1 sibling, 0 replies; 146+ messages in thread
From: Marc Zyngier @ 2024-05-29 15:46 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, mingo, mpe, naveen.n.rao,
	npiggin, oliver.upton, shuah, szabolcs.nagy, tglx, will, x86,
	kvmarm

On Fri, 03 May 2024 14:01:26 +0100,
Joey Gouly <joey.gouly@arm.com> wrote:
> 
> To allow using newer instructions that current assemblers don't know about,
> replace the `at` instruction with the underlying SYS instruction.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Oliver Upton <oliver.upton@linux.dev>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/kvm_asm.h       | 3 ++-
>  arch/arm64/kvm/hyp/include/hyp/fault.h | 2 +-
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> index 24b5e6b23417..ce65fd0f01b0 100644
> --- a/arch/arm64/include/asm/kvm_asm.h
> +++ b/arch/arm64/include/asm/kvm_asm.h
> @@ -10,6 +10,7 @@
>  #include <asm/hyp_image.h>
>  #include <asm/insn.h>
>  #include <asm/virt.h>
> +#include <asm/sysreg.h>

nit: include order.

>  
>  #define ARM_EXIT_WITH_SERROR_BIT  31
>  #define ARM_EXCEPTION_CODE(x)	  ((x) & ~(1U << ARM_EXIT_WITH_SERROR_BIT))
> @@ -261,7 +262,7 @@ extern u64 __kvm_get_mdcr_el2(void);
>  	asm volatile(							\
>  	"	mrs	%1, spsr_el2\n"					\
>  	"	mrs	%2, elr_el2\n"					\
> -	"1:	at	"at_op", %3\n"					\
> +	"1:	" __msr_s(at_op, "%3") "\n"				\
>  	"	isb\n"							\
>  	"	b	9f\n"						\
>  	"2:	msr	spsr_el2, %1\n"					\
> diff --git a/arch/arm64/kvm/hyp/include/hyp/fault.h b/arch/arm64/kvm/hyp/include/hyp/fault.h
> index 9e13c1bc2ad5..487c06099d6f 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/fault.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/fault.h
> @@ -27,7 +27,7 @@ static inline bool __translate_far_to_hpfar(u64 far, u64 *hpfar)
>  	 * saved the guest context yet, and we may return early...
>  	 */
>  	par = read_sysreg_par();
> -	if (!__kvm_at("s1e1r", far))
> +	if (!__kvm_at(OP_AT_S1E1R, far))
>  		tmp = read_sysreg_par();
>  	else
>  		tmp = SYS_PAR_EL1_F; /* back to the guest */

Reviewed-by: Marc Zyngier <maz@kernel.org>

	M.

-- 
Without deviation from the norm, progress is not possible.

* Re: [PATCH v4 09/29] KVM: arm64: use `at s1e1a` for POE
  2024-05-03 13:01 ` [PATCH v4 09/29] KVM: arm64: use `at s1e1a` for POE Joey Gouly
@ 2024-05-29 15:50   ` Marc Zyngier
  2024-07-15  8:45   ` Anshuman Khandual
  1 sibling, 0 replies; 146+ messages in thread
From: Marc Zyngier @ 2024-05-29 15:50 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, mingo, mpe, naveen.n.rao,
	npiggin, oliver.upton, shuah, szabolcs.nagy, tglx, will, x86,
	kvmarm

On Fri, 03 May 2024 14:01:27 +0100,
Joey Gouly <joey.gouly@arm.com> wrote:
> 
> FEAT_ATS1E1A introduces a new instruction: `at s1e1a`.
> This is an address translation, without permission checks.
> 
> POE allows read permissions to be removed from S1 by the guest.  This means
> that an `at` instruction could fail, and not get the IPA.
> 
> Switch to using `at s1e1a` so that KVM can get the IPA regardless of S1
> permissions.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Oliver Upton <oliver.upton@linux.dev>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/kvm/hyp/include/hyp/fault.h | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kvm/hyp/include/hyp/fault.h b/arch/arm64/kvm/hyp/include/hyp/fault.h
> index 487c06099d6f..17df94570f03 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/fault.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/fault.h
> @@ -14,6 +14,7 @@
>  
>  static inline bool __translate_far_to_hpfar(u64 far, u64 *hpfar)
>  {
> +	int ret;
>  	u64 par, tmp;
>  
>  	/*
> @@ -27,7 +28,9 @@ static inline bool __translate_far_to_hpfar(u64 far, u64 *hpfar)
>  	 * saved the guest context yet, and we may return early...
>  	 */
>  	par = read_sysreg_par();
> -	if (!__kvm_at(OP_AT_S1E1R, far))
> +	ret = system_supports_poe() ? __kvm_at(OP_AT_S1E1A, far) :
> +	                              __kvm_at(OP_AT_S1E1R, far);
> +	if (!ret)
>  		tmp = read_sysreg_par();
>  	else
>  		tmp = SYS_PAR_EL1_F; /* back to the guest */

Reviewed-by: Marc Zyngier <maz@kernel.org>

	M.

-- 
Without deviation from the norm, progress is not possible.

* Re: [PATCH v4 28/29] kselftest/arm64: Add test case for POR_EL0 signal frame records
  2024-05-03 13:01 ` [PATCH v4 28/29] kselftest/arm64: Add test case for POR_EL0 signal frame records Joey Gouly
@ 2024-05-29 15:51   ` Mark Brown
  2024-07-05 19:34     ` Shuah Khan
  2024-07-09 13:10   ` Kevin Brodsky
  1 sibling, 1 reply; 146+ messages in thread
From: Mark Brown @ 2024-05-29 15:51 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Fri, May 03, 2024 at 02:01:46PM +0100, Joey Gouly wrote:
> Ensure that we get signal context for POR_EL0 if and only if POE is present
> on the system.

Reviewed-by: Mark Brown <broonie@kernel.org>


* Re: [PATCH v4 17/29] arm64: implement PKEYS support
  2024-05-03 13:01 ` [PATCH v4 17/29] arm64: implement PKEYS support Joey Gouly
  2024-05-28  6:55   ` Amit Daniel Kachhap
@ 2024-05-31 14:57   ` Szabolcs Nagy
  2024-05-31 15:21     ` Joey Gouly
  2024-07-05 16:59   ` Catalin Marinas
                     ` (3 subsequent siblings)
  5 siblings, 1 reply; 146+ messages in thread
From: Szabolcs Nagy @ 2024-05-31 14:57 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, tglx, will, x86, kvmarm

The 05/03/2024 14:01, Joey Gouly wrote:
> Implement the PKEYS interface, using the Permission Overlay Extension.
...
> +#ifdef CONFIG_ARCH_HAS_PKEYS
> +int arch_set_user_pkey_access(struct task_struct *tsk, int pkey, unsigned long init_val)
> +{
> +	u64 new_por = POE_RXW;
> +	u64 old_por;
> +	u64 pkey_shift;
> +
> +	if (!arch_pkeys_enabled())
> +		return -ENOSPC;
> +
> +	/*
> +	 * This code should only be called with valid 'pkey'
> +	 * values originating from in-kernel users.  Complain
> +	 * if a bad value is observed.
> +	 */
> +	if (WARN_ON_ONCE(pkey >= arch_max_pkey()))
> +		return -EINVAL;
> +
> +	/* Set the bits we need in POR:  */
> +	if (init_val & PKEY_DISABLE_ACCESS)
> +		new_por = POE_X;
> +	else if (init_val & PKEY_DISABLE_WRITE)
> +		new_por = POE_RX;
> +

given that the architecture allows r,w,x permissions to be
set independently, should we have a 'PKEY_DISABLE_EXEC' or
similar api flag?

(on other targets it can be some invalid value that fails)

> +	/* Shift the bits in to the correct place in POR for pkey: */
> +	pkey_shift = pkey * POR_BITS_PER_PKEY;
> +	new_por <<= pkey_shift;
> +
> +	/* Get old POR and mask off any old bits in place: */
> +	old_por = read_sysreg_s(SYS_POR_EL0);
> +	old_por &= ~(POE_MASK << pkey_shift);
> +
> +	/* Write old part along with new part: */
> +	write_sysreg_s(old_por | new_por, SYS_POR_EL0);
> +
> +	return 0;
> +}
> +#endif
> -- 
> 2.25.1
> 
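As an illustration of the read-modify-write sequence in the quoted arch_set_user_pkey_access(), the per-key POR_EL0 field update can be modelled in plain C. The POE_* encodings and POR_BITS_PER_PKEY below follow the arm64 series but are reproduced here as assumptions of this sketch, not taken from a system header:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace model of the POR_EL0 update above. Each pkey owns a 4-bit
 * permission field; POE_* values are assumptions mirroring the patch.
 */
#define POE_X		0x2ULL	/* execute-only: PKEY_DISABLE_ACCESS */
#define POE_RX		0x3ULL	/* read+execute: PKEY_DISABLE_WRITE */
#define POE_RXW		0x7ULL	/* read+write+execute: default */
#define POE_MASK	0xfULL
#define POR_BITS_PER_PKEY 4

static uint64_t por_set_pkey(uint64_t por, int pkey, uint64_t perm)
{
	uint64_t pkey_shift = (uint64_t)pkey * POR_BITS_PER_PKEY;

	/* Mask off the old 4-bit field, then install the new permissions. */
	por &= ~(POE_MASK << pkey_shift);
	return por | (perm << pkey_shift);
}
```

For example, setting pkey 2 to POE_RX clears bits [11:8] of the modelled register and writes 0x3 there, matching the shift/mask/write steps in the patch.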

* Re: [PATCH v4 17/29] arm64: implement PKEYS support
  2024-05-31 14:57   ` Szabolcs Nagy
@ 2024-05-31 15:21     ` Joey Gouly
  2024-05-31 16:27       ` Szabolcs Nagy
  0 siblings, 1 reply; 146+ messages in thread
From: Joey Gouly @ 2024-05-31 15:21 UTC (permalink / raw)
  To: Szabolcs Nagy, dave.hansen
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, tglx, will, x86,
	kvmarm

Hi Szabolcs,

On Fri, May 31, 2024 at 03:57:07PM +0100, Szabolcs Nagy wrote:
> The 05/03/2024 14:01, Joey Gouly wrote:
> > Implement the PKEYS interface, using the Permission Overlay Extension.
> ...
> > +#ifdef CONFIG_ARCH_HAS_PKEYS
> > +int arch_set_user_pkey_access(struct task_struct *tsk, int pkey, unsigned long init_val)
> > +{
> > +	u64 new_por = POE_RXW;
> > +	u64 old_por;
> > +	u64 pkey_shift;
> > +
> > +	if (!arch_pkeys_enabled())
> > +		return -ENOSPC;
> > +
> > +	/*
> > +	 * This code should only be called with valid 'pkey'
> > +	 * values originating from in-kernel users.  Complain
> > +	 * if a bad value is observed.
> > +	 */
> > +	if (WARN_ON_ONCE(pkey >= arch_max_pkey()))
> > +		return -EINVAL;
> > +
> > +	/* Set the bits we need in POR:  */
> > +	if (init_val & PKEY_DISABLE_ACCESS)
> > +		new_por = POE_X;
> > +	else if (init_val & PKEY_DISABLE_WRITE)
> > +		new_por = POE_RX;
> > +
> 
> given that the architecture allows r,w,x permissions to be
> set independently, should we have a 'PKEY_DISABLE_EXEC' or
> similar api flag?
> 
> (on other targets it can be some invalid value that fails)

I didn't think about the best way to do that yet. PowerPC has a PKEY_DISABLE_EXECUTE.

We could either make that generic, and X86 has to error if it sees that bit, or
we add an arch-specific PKEY_DISABLE_EXECUTE like PowerPC.

A user can still set it by interacting with the register directly, but I guess
we want something for the glibc interface..

Dave, any thoughts here?

> 
> > +	/* Shift the bits in to the correct place in POR for pkey: */
> > +	pkey_shift = pkey * POR_BITS_PER_PKEY;
> > +	new_por <<= pkey_shift;
> > +
> > +	/* Get old POR and mask off any old bits in place: */
> > +	old_por = read_sysreg_s(SYS_POR_EL0);
> > +	old_por &= ~(POE_MASK << pkey_shift);
> > +
> > +	/* Write old part along with new part: */
> > +	write_sysreg_s(old_por | new_por, SYS_POR_EL0);
> > +
> > +	return 0;
> > +}
> > +#endif

Thanks,
Joey

* Re: [PATCH v4 17/29] arm64: implement PKEYS support
  2024-05-31 15:21     ` Joey Gouly
@ 2024-05-31 16:27       ` Szabolcs Nagy
  2024-06-17 13:40         ` Florian Weimer
  0 siblings, 1 reply; 146+ messages in thread
From: Szabolcs Nagy @ 2024-05-31 16:27 UTC (permalink / raw)
  To: Joey Gouly, dave.hansen
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, tglx, will, x86, kvmarm, Florian Weimer

The 05/31/2024 16:21, Joey Gouly wrote:
> Hi Szabolcs,
> 
> On Fri, May 31, 2024 at 03:57:07PM +0100, Szabolcs Nagy wrote:
> > The 05/03/2024 14:01, Joey Gouly wrote:
> > > Implement the PKEYS interface, using the Permission Overlay Extension.
> > ...
> > > +#ifdef CONFIG_ARCH_HAS_PKEYS
> > > +int arch_set_user_pkey_access(struct task_struct *tsk, int pkey, unsigned long init_val)
> > > +{
> > > +	u64 new_por = POE_RXW;
> > > +	u64 old_por;
> > > +	u64 pkey_shift;
> > > +
> > > +	if (!arch_pkeys_enabled())
> > > +		return -ENOSPC;
> > > +
> > > +	/*
> > > +	 * This code should only be called with valid 'pkey'
> > > +	 * values originating from in-kernel users.  Complain
> > > +	 * if a bad value is observed.
> > > +	 */
> > > +	if (WARN_ON_ONCE(pkey >= arch_max_pkey()))
> > > +		return -EINVAL;
> > > +
> > > +	/* Set the bits we need in POR:  */
> > > +	if (init_val & PKEY_DISABLE_ACCESS)
> > > +		new_por = POE_X;
> > > +	else if (init_val & PKEY_DISABLE_WRITE)
> > > +		new_por = POE_RX;
> > > +
> > 
> > given that the architecture allows r,w,x permissions to be
> > set independently, should we have a 'PKEY_DISABLE_EXEC' or
> > similar api flag?
> > 
> > (on other targets it can be some invalid value that fails)
> 
> I didn't think about the best way to do that yet. PowerPC has a PKEY_DISABLE_EXECUTE.
> 
> We could either make that generic, and X86 has to error if it sees that bit, or
> > we add an arch-specific PKEY_DISABLE_EXECUTE like PowerPC.

this does not seem to be in glibc yet. (or in linux man pages)

i guess you can copy whatever ppc does.

> 
> A user can still set it by interacting with the register directly, but I guess
> we want something for the glibc interface..
> 
> Dave, any thoughts here?

adding Florian too, since i found an old thread of his that tried
to add separate PKEY_DISABLE_READ and PKEY_DISABLE_EXECUTE, but
it did not seem to end up upstream. (this makes more sense to me
as libc api than the weird disable access semantics)

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-05-28  6:56   ` Amit Daniel Kachhap
@ 2024-05-31 16:39     ` Mark Brown
  2024-06-03  9:21       ` Amit Daniel Kachhap
  0 siblings, 1 reply; 146+ messages in thread
From: Mark Brown @ 2024-05-31 16:39 UTC (permalink / raw)
  To: Amit Daniel Kachhap
  Cc: Joey Gouly, linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar,
	bp, catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Tue, May 28, 2024 at 12:26:54PM +0530, Amit Daniel Kachhap wrote:
> On 5/3/24 18:31, Joey Gouly wrote:

> > +#define POE_MAGIC	0x504f4530

> > +struct poe_context {
> > +	struct _aarch64_ctx head;
> > +	__u64 por_el0;
> > +};

> There is a comment section in the beginning which mentions the size
> of the context frame structure and subsequent reduction in the
> reserved range. So this new context description can be added there.
> Although looks like it is broken for za, zt and fpmr context.

Could you be more specific about how you think these existing contexts
are broken?  The above looks perfectly good and standard and the
existing contexts do a reasonable simulation of working.  Note that the
ZA and ZT contexts don't generate data payload unless userspace has set
PSTATE.ZA.


* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-05-31 16:39     ` Mark Brown
@ 2024-06-03  9:21       ` Amit Daniel Kachhap
  2024-07-25 15:58         ` Dave Martin
  0 siblings, 1 reply; 146+ messages in thread
From: Amit Daniel Kachhap @ 2024-06-03  9:21 UTC (permalink / raw)
  To: Mark Brown
  Cc: Joey Gouly, linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar,
	bp, catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm



On 5/31/24 22:09, Mark Brown wrote:
> On Tue, May 28, 2024 at 12:26:54PM +0530, Amit Daniel Kachhap wrote:
>> On 5/3/24 18:31, Joey Gouly wrote:
> 
>>> +#define POE_MAGIC	0x504f4530
> 
>>> +struct poe_context {
>>> +	struct _aarch64_ctx head;
>>> +	__u64 por_el0;
>>> +};
> 
>> There is a comment section in the beginning which mentions the size
>> of the context frame structure and subsequent reduction in the
>> reserved range. So this new context description can be added there.
>> Although looks like it is broken for za, zt and fpmr context.
> 
> Could you be more specific about how you think these existing contexts
> are broken?  The above looks perfectly good and standard and the
> existing contexts do a reasonable simulation of working.  Note that the
> ZA and ZT contexts don't generate data payload unless userspace has set
> PSTATE.ZA.

Sorry for not being clear on this as I was only referring to the
comments in file arch/arm64/include/uapi/asm/sigcontext.h and no code
as such is broken.

  * Allocation of __reserved[]:
  * (Note: records do not necessarily occur in the order shown here.)
  *
  *      size            description
  *
  *      0x210           fpsimd_context
  *       0x10           esr_context
  *      0x8a0           sve_context (vl <= 64) (optional)
  *       0x20           extra_context (optional)
  *       0x10           terminator (null _aarch64_ctx)
  *
  *      0x510           (reserved for future allocation)

Here I think that optional context like za, zt, fpmr and poe should have
size mentioned here to make the description consistent. As you said ZA
and ZT context are enabled by userspace so some extra details can be
added for them too.
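For reference, the _aarch64_ctx record chain being described can be exercised with a small userspace walker over the __reserved[] area. The struct layouts and POE_MAGIC mirror the patch under discussion; the walking loop itself is an illustration of typical usage, not kernel code:

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors of the uapi structs from the patch (assumed, for illustration). */
struct _aarch64_ctx {
	uint32_t magic;
	uint32_t size;
};

struct poe_context {
	struct _aarch64_ctx head;
	uint64_t por_el0;
};

#define POE_MAGIC 0x504f4530

/* Walk the record chain in __reserved[] looking for the POE record. */
static struct poe_context *find_poe(void *reserved, size_t len)
{
	char *p = reserved;
	char *end = p + len;

	while ((size_t)(end - p) >= sizeof(struct _aarch64_ctx)) {
		struct _aarch64_ctx *h = (struct _aarch64_ctx *)p;

		if (h->magic == 0)		/* null terminator record */
			break;
		if (h->magic == POE_MAGIC)
			return (struct poe_context *)h;
		if (h->size < sizeof(*h))
			break;			/* malformed frame, stop */
		p += h->size;
	}
	return NULL;
}
```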


* Re: [PATCH v4 17/29] arm64: implement PKEYS support
  2024-05-31 16:27       ` Szabolcs Nagy
@ 2024-06-17 13:40         ` Florian Weimer
  2024-06-17 14:51           ` Szabolcs Nagy
  0 siblings, 1 reply; 146+ messages in thread
From: Florian Weimer @ 2024-06-17 13:40 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: Joey Gouly, dave.hansen, linux-arm-kernel, akpm, aneesh.kumar,
	aneesh.kumar, bp, broonie, catalin.marinas, christophe.leroy, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, tglx, will, x86,
	kvmarm

* Szabolcs Nagy:

>> A user can still set it by interacting with the register directly, but I guess
>> we want something for the glibc interface..
>> 
>> Dave, any thoughts here?
>
> adding Florian too, since i found an old thread of his that tried
> to add separate PKEY_DISABLE_READ and PKEY_DISABLE_EXECUTE, but
> it did not seem to end up upstream. (this makes more sense to me
> as libc api than the weird disable access semantics)

I still think it makes sense to have a full complement of PKEY_* flags
complementing the PROT_* flags, in a somewhat abstract fashion for
pkey_alloc only.  The internal protection mask register encoding will
differ from architecture to architecture, but the abstract glibc
functions pkey_set and pkey_get could use them (if we are a bit
careful).

Thanks,
Florian


* Re: [PATCH v4 17/29] arm64: implement PKEYS support
  2024-06-17 13:40         ` Florian Weimer
@ 2024-06-17 14:51           ` Szabolcs Nagy
  2024-07-08 17:53             ` Catalin Marinas
  0 siblings, 1 reply; 146+ messages in thread
From: Szabolcs Nagy @ 2024-06-17 14:51 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Joey Gouly, dave.hansen, linux-arm-kernel, akpm, aneesh.kumar,
	aneesh.kumar, bp, broonie, catalin.marinas, christophe.leroy, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, tglx, will, x86,
	kvmarm

The 06/17/2024 15:40, Florian Weimer wrote:
> >> A user can still set it by interacting with the register directly, but I guess
> >> we want something for the glibc interface..
> >> 
> >> Dave, any thoughts here?
> >
> > adding Florian too, since i found an old thread of his that tried
> > to add separate PKEY_DISABLE_READ and PKEY_DISABLE_EXECUTE, but
> > it did not seem to end up upstream. (this makes more sense to me
> > as libc api than the weird disable access semantics)
> 
> I still think it makes sense to have a full complenent of PKEY_* flags
> complementing the PROT_* flags, in a somewhat abstract fashion for
> pkey_alloc only.  The internal protection mask register encoding will
> differ from architecture to architecture, but the abstract glibc
> functions pkey_set and pkey_get could use them (if we are a bit
> careful).

to me it makes sense to have abstract

PKEY_DISABLE_READ
PKEY_DISABLE_WRITE
PKEY_DISABLE_EXECUTE
PKEY_DISABLE_ACCESS

where access is handled like

if (flags&PKEY_DISABLE_ACCESS)
	flags |= PKEY_DISABLE_READ|PKEY_DISABLE_WRITE;
disable_read = flags&PKEY_DISABLE_READ;
disable_write = flags&PKEY_DISABLE_WRITE;
disable_exec = flags&PKEY_DISABLE_EXECUTE;

if there are unsupported combinations like
disable_read&&!disable_write then those are rejected
by pkey_alloc and pkey_set.

this allows portable use of pkey apis.
(the flags could be target specific, but don't have to be)
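The decoding rules sketched above can be written as a compilable C helper. Only PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE exist in the generic uapi today; PKEY_DISABLE_READ and PKEY_DISABLE_EXECUTE, and their values, are hypothetical placeholders for the API being discussed:

```c
#include <stdbool.h>

#define PKEY_DISABLE_ACCESS	0x1
#define PKEY_DISABLE_WRITE	0x2
#define PKEY_DISABLE_READ	0x4	/* hypothetical */
#define PKEY_DISABLE_EXECUTE	0x8	/* hypothetical */

static int decode_pkey_flags(unsigned long flags,
			     bool *disable_read, bool *disable_write,
			     bool *disable_exec)
{
	/* "disable access" is shorthand for disabling both read and write */
	if (flags & PKEY_DISABLE_ACCESS)
		flags |= PKEY_DISABLE_READ | PKEY_DISABLE_WRITE;

	*disable_read = flags & PKEY_DISABLE_READ;
	*disable_write = flags & PKEY_DISABLE_WRITE;
	*disable_exec = flags & PKEY_DISABLE_EXECUTE;

	/*
	 * Reject combinations a given target cannot express, e.g. a target
	 * with no way to encode "read disabled but write enabled".
	 */
	if (*disable_read && !*disable_write)
		return -1;

	return 0;
}
```

Under this scheme pkey_alloc() and pkey_set() would return an error for unsupported combinations, keeping the flag names portable across architectures.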

* Re: [PATCH v4 13/29] arm64: convert protection key into vm_flags and pgprot values
  2024-05-28  6:54   ` Amit Daniel Kachhap
@ 2024-06-19 16:45     ` Catalin Marinas
  2024-07-04 12:47       ` Joey Gouly
  0 siblings, 1 reply; 146+ messages in thread
From: Catalin Marinas @ 2024-06-19 16:45 UTC (permalink / raw)
  To: Amit Daniel Kachhap
  Cc: Joey Gouly, linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar,
	bp, broonie, christophe.leroy, dave.hansen, hpa, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On Tue, May 28, 2024 at 12:24:57PM +0530, Amit Daniel Kachhap wrote:
> On 5/3/24 18:31, Joey Gouly wrote:
> > diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
> > index 5966ee4a6154..ecb2d18dc4d7 100644
> > --- a/arch/arm64/include/asm/mman.h
> > +++ b/arch/arm64/include/asm/mman.h
> > @@ -7,7 +7,7 @@
> >   #include <uapi/asm/mman.h>
> >   static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> > -	unsigned long pkey __always_unused)
> > +	unsigned long pkey)
> >   {
> >   	unsigned long ret = 0;
> > @@ -17,6 +17,12 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> >   	if (system_supports_mte() && (prot & PROT_MTE))
> >   		ret |= VM_MTE;
> > +#if defined(CONFIG_ARCH_HAS_PKEYS)
> 
> Should there be system_supports_poe() check like above?

I think it should, otherwise we end up with these bits in the pte even
when POE is not supported.

> > +	ret |= pkey & 0x1 ? VM_PKEY_BIT0 : 0;
> > +	ret |= pkey & 0x2 ? VM_PKEY_BIT1 : 0;
> > +	ret |= pkey & 0x4 ? VM_PKEY_BIT2 : 0;
> > +#endif
> > +
> >   	return ret;
> >   }
> >   #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)

-- 
Catalin
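For illustration, the pkey-to-vm_flags mapping quoted in this reply, with the system_supports_poe() guard folded in, can be modelled in userspace C. The VM_PKEY_BIT* positions here are illustrative assumptions, not the kernel's actual values:

```c
#include <stdint.h>

#define VM_PKEY_BIT0	(1ULL << 32)	/* assumed bit position */
#define VM_PKEY_BIT1	(1ULL << 33)	/* assumed bit position */
#define VM_PKEY_BIT2	(1ULL << 34)	/* assumed bit position */

static int poe_supported = 1;	/* stand-in for system_supports_poe() */

static uint64_t pkey_vm_flags(unsigned long pkey)
{
	uint64_t ret = 0;

	/* without POE, never set pkey bits in the vma flags */
	if (!poe_supported)
		return 0;

	ret |= pkey & 0x1 ? VM_PKEY_BIT0 : 0;
	ret |= pkey & 0x2 ? VM_PKEY_BIT1 : 0;
	ret |= pkey & 0x4 ? VM_PKEY_BIT2 : 0;
	return ret;
}
```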

* Re: [PATCH v4 15/29] arm64: handle PKEY/POE faults
  2024-05-03 13:01 ` [PATCH v4 15/29] arm64: handle PKEY/POE faults Joey Gouly
@ 2024-06-21 16:57   ` Catalin Marinas
  2024-07-09 13:03   ` Kevin Brodsky
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 146+ messages in thread
From: Catalin Marinas @ 2024-06-21 16:57 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On Fri, May 03, 2024 at 02:01:33PM +0100, Joey Gouly wrote:
> @@ -529,6 +547,8 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>  	unsigned int mm_flags = FAULT_FLAG_DEFAULT;
>  	unsigned long addr = untagged_addr(far);
>  	struct vm_area_struct *vma;
> +	bool pkey_fault = false;
> +	int pkey = -1;
>  
>  	if (kprobe_page_fault(regs, esr))
>  		return 0;
> @@ -590,6 +610,12 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>  		vma_end_read(vma);
>  		goto lock_mmap;
>  	}
> +
> +	if (fault_from_pkey(esr, vma, mm_flags)) {
> +		vma_end_read(vma);
> +		goto lock_mmap;
> +	}
> +
>  	fault = handle_mm_fault(vma, addr, mm_flags | FAULT_FLAG_VMA_LOCK, regs);
>  	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
>  		vma_end_read(vma);
> @@ -617,6 +643,11 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>  		goto done;
>  	}
>  
> +	if (fault_from_pkey(esr, vma, mm_flags)) {
> +		pkey_fault = true;
> +		pkey = vma_pkey(vma);
> +	}

I was wondering if we actually need to test this again. We know the
fault was from a pkey already above but I guess it matches what we do
with the vma->vm_flags check in case it races with some mprotect() call.

> +
>  	fault = __do_page_fault(mm, vma, addr, mm_flags, vm_flags, regs);

You'll need to rebase this on 6.10-rcX since this function disappeared.

Otherwise the patch looks fine.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 05/29] arm64: cpufeature: add Permission Overlay Extension cpucap
  2024-05-03 13:01 ` [PATCH v4 05/29] arm64: cpufeature: add Permission Overlay Extension cpucap Joey Gouly
@ 2024-06-21 16:58   ` Catalin Marinas
  2024-06-21 17:01   ` Catalin Marinas
  2024-07-15  7:47   ` Anshuman Khandual
  2 siblings, 0 replies; 146+ messages in thread
From: Catalin Marinas @ 2024-06-21 16:58 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On Fri, May 03, 2024 at 02:01:23PM +0100, Joey Gouly wrote:
> This indicates if the system supports POE. This is a CPUCAP_BOOT_CPU_FEATURE
> as the boot CPU will enable POE if it has it, so secondary CPUs must also
> have this feature.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>

Adding some acks otherwise I'll forget what I reviewed.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 05/29] arm64: cpufeature: add Permission Overlay Extension cpucap
  2024-05-03 13:01 ` [PATCH v4 05/29] arm64: cpufeature: add Permission Overlay Extension cpucap Joey Gouly
  2024-06-21 16:58   ` Catalin Marinas
@ 2024-06-21 17:01   ` Catalin Marinas
  2024-06-21 17:02     ` Catalin Marinas
  2024-07-15  7:47   ` Anshuman Khandual
  2 siblings, 1 reply; 146+ messages in thread
From: Catalin Marinas @ 2024-06-21 17:01 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On Fri, May 03, 2024 at 02:01:23PM +0100, Joey Gouly wrote:
> This indicates if the system supports POE. This is a CPUCAP_BOOT_CPU_FEATURE
> as the boot CPU will enable POE if it has it, so secondary CPUs must also
> have this feature.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>

Acked-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 05/29] arm64: cpufeature: add Permission Overlay Extension cpucap
  2024-06-21 17:01   ` Catalin Marinas
@ 2024-06-21 17:02     ` Catalin Marinas
  0 siblings, 0 replies; 146+ messages in thread
From: Catalin Marinas @ 2024-06-21 17:02 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On Fri, Jun 21, 2024 at 06:01:52PM +0100, Catalin Marinas wrote:
> On Fri, May 03, 2024 at 02:01:23PM +0100, Joey Gouly wrote:
> > This indicates if the system supports POE. This is a CPUCAP_BOOT_CPU_FEATURE
> > as the boot CPU will enable POE if it has it, so secondary CPUs must also
> > have this feature.
> > 
> > Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> 
> Acked-by: Catalin Marinas <catalin.marinas@arm.com>

One ack is sufficient, ignore this one ;)

-- 
Catalin

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 06/29] arm64: context switch POR_EL0 register
  2024-05-03 13:01 ` [PATCH v4 06/29] arm64: context switch POR_EL0 register Joey Gouly
@ 2024-06-21 17:03   ` Catalin Marinas
  2024-06-21 17:07   ` Catalin Marinas
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 146+ messages in thread
From: Catalin Marinas @ 2024-06-21 17:03 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On Fri, May 03, 2024 at 02:01:24PM +0100, Joey Gouly wrote:
> POR_EL0 is a register that can be modified by userspace directly,
> so it must be context switched.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 10/29] arm64: enable the Permission Overlay Extension for EL0
  2024-05-03 13:01 ` [PATCH v4 10/29] arm64: enable the Permission Overlay Extension for EL0 Joey Gouly
@ 2024-06-21 17:04   ` Catalin Marinas
  2024-07-15  9:13   ` Anshuman Khandual
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 146+ messages in thread
From: Catalin Marinas @ 2024-06-21 17:04 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On Fri, May 03, 2024 at 02:01:28PM +0100, Joey Gouly wrote:
> Expose a HWCAP and ID_AA64MMFR3_EL1_S1POE to userspace, so they can be used to
> check if the CPU supports the feature.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 11/29] arm64: re-order MTE VM_ flags
  2024-05-03 13:01 ` [PATCH v4 11/29] arm64: re-order MTE VM_ flags Joey Gouly
@ 2024-06-21 17:04   ` Catalin Marinas
  2024-07-15  9:21   ` Anshuman Khandual
  1 sibling, 0 replies; 146+ messages in thread
From: Catalin Marinas @ 2024-06-21 17:04 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On Fri, May 03, 2024 at 02:01:29PM +0100, Joey Gouly wrote:
> To make it easier to share the generic PKEYs flags, move the MTE flag.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>

Acked-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 12/29] arm64: add POIndex defines
  2024-05-03 13:01 ` [PATCH v4 12/29] arm64: add POIndex defines Joey Gouly
@ 2024-06-21 17:05   ` Catalin Marinas
  2024-07-15  9:26   ` Anshuman Khandual
  1 sibling, 0 replies; 146+ messages in thread
From: Catalin Marinas @ 2024-06-21 17:05 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On Fri, May 03, 2024 at 02:01:30PM +0100, Joey Gouly wrote:
> The 3-bit POIndex is stored in the PTE at bits 60..62.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>

Acked-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 06/29] arm64: context switch POR_EL0 register
  2024-05-03 13:01 ` [PATCH v4 06/29] arm64: context switch POR_EL0 register Joey Gouly
  2024-06-21 17:03   ` Catalin Marinas
@ 2024-06-21 17:07   ` Catalin Marinas
  2024-07-15  8:27   ` Anshuman Khandual
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 146+ messages in thread
From: Catalin Marinas @ 2024-06-21 17:07 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On Fri, May 03, 2024 at 02:01:24PM +0100, Joey Gouly wrote:
> +static void flush_poe(void)
> +{
> +	if (!system_supports_poe())
> +		return;
> +
> +	write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0);
> +	/* ISB required for kernel uaccess routines when chaning POR_EL0 */

Nit: s/chaning/changing/

-- 
Catalin

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 16/29] arm64: add pte_access_permitted_no_overlay()
  2024-05-03 13:01 ` [PATCH v4 16/29] arm64: add pte_access_permitted_no_overlay() Joey Gouly
@ 2024-06-21 17:15   ` Catalin Marinas
  2024-07-16 10:21   ` Anshuman Khandual
  1 sibling, 0 replies; 146+ messages in thread
From: Catalin Marinas @ 2024-06-21 17:15 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On Fri, May 03, 2024 at 02:01:34PM +0100, Joey Gouly wrote:
> We do not want to take POE into account when clearing the MTE tags.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 20/29] arm64: enable POE and PIE to coexist
  2024-05-03 13:01 ` [PATCH v4 20/29] arm64: enable POE and PIE to coexist Joey Gouly
@ 2024-06-21 17:16   ` Catalin Marinas
  2024-07-16 10:41   ` Anshuman Khandual
  1 sibling, 0 replies; 146+ messages in thread
From: Catalin Marinas @ 2024-06-21 17:16 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On Fri, May 03, 2024 at 02:01:38PM +0100, Joey Gouly wrote:
> Set the EL0/userspace indirection encodings to be the overlay enabled
> variants of the permissions.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 13/29] arm64: convert protection key into vm_flags and pgprot values
  2024-06-19 16:45     ` Catalin Marinas
@ 2024-07-04 12:47       ` Joey Gouly
  2024-07-08 17:22         ` Catalin Marinas
  0 siblings, 1 reply; 146+ messages in thread
From: Joey Gouly @ 2024-07-04 12:47 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Amit Daniel Kachhap, linux-arm-kernel, akpm, aneesh.kumar,
	aneesh.kumar, bp, broonie, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

Hi,

On Wed, Jun 19, 2024 at 05:45:29PM +0100, Catalin Marinas wrote:
> On Tue, May 28, 2024 at 12:24:57PM +0530, Amit Daniel Kachhap wrote:
> > On 5/3/24 18:31, Joey Gouly wrote:
> > > diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
> > > index 5966ee4a6154..ecb2d18dc4d7 100644
> > > --- a/arch/arm64/include/asm/mman.h
> > > +++ b/arch/arm64/include/asm/mman.h
> > > @@ -7,7 +7,7 @@
> > >   #include <uapi/asm/mman.h>
> > >   static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> > > -	unsigned long pkey __always_unused)
> > > +	unsigned long pkey)
> > >   {
> > >   	unsigned long ret = 0;
> > > @@ -17,6 +17,12 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> > >   	if (system_supports_mte() && (prot & PROT_MTE))
> > >   		ret |= VM_MTE;
> > > +#if defined(CONFIG_ARCH_HAS_PKEYS)
> > 
> > Should there be system_supports_poe() check like above?
> 
> I think it should, otherwise we end up with these bits in the pte even
> when POE is not supported.

I think it can't get here due to the flow of the code, but I will add it to be
defensive (since it's just an alternative that gets patched).
I still need the defined(CONFIG_ARCH_HAS_PKEYS) check, since the VM_PKEY_BIT*
are only defined then.

> 
> > > +	ret |= pkey & 0x1 ? VM_PKEY_BIT0 : 0;
> > > +	ret |= pkey & 0x2 ? VM_PKEY_BIT1 : 0;
> > > +	ret |= pkey & 0x4 ? VM_PKEY_BIT2 : 0;
> > > +#endif
> > > +
> > >   	return ret;
> > >   }
> > >   #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)

Thanks,
Joey

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 17/29] arm64: implement PKEYS support
  2024-05-03 13:01 ` [PATCH v4 17/29] arm64: implement PKEYS support Joey Gouly
  2024-05-28  6:55   ` Amit Daniel Kachhap
  2024-05-31 14:57   ` Szabolcs Nagy
@ 2024-07-05 16:59   ` Catalin Marinas
  2024-07-22 13:39     ` Kevin Brodsky
  2024-07-09 13:07   ` Kevin Brodsky
                     ` (2 subsequent siblings)
  5 siblings, 1 reply; 146+ messages in thread
From: Catalin Marinas @ 2024-07-05 16:59 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On Fri, May 03, 2024 at 02:01:35PM +0100, Joey Gouly wrote:
> @@ -163,7 +182,8 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
>  #define pte_access_permitted_no_overlay(pte, write) \
>  	(((pte_val(pte) & (PTE_VALID | PTE_USER)) == (PTE_VALID | PTE_USER)) && (!(write) || pte_write(pte)))
>  #define pte_access_permitted(pte, write) \
> -	pte_access_permitted_no_overlay(pte, write)
> +	(pte_access_permitted_no_overlay(pte, write) && \
> +	por_el0_allows_pkey(FIELD_GET(PTE_PO_IDX_MASK, pte_val(pte)), write, false))

I'm still not entirely convinced on checking the keys during fast GUP
but that's what x86 and powerpc do already, so I guess we'll follow the
same ABI.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-05-03 13:01 ` [PATCH v4 18/29] arm64: add POE signal support Joey Gouly
  2024-05-28  6:56   ` Amit Daniel Kachhap
@ 2024-07-05 17:04   ` Catalin Marinas
  2024-07-09 13:08   ` Kevin Brodsky
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 146+ messages in thread
From: Catalin Marinas @ 2024-07-05 17:04 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On Fri, May 03, 2024 at 02:01:36PM +0100, Joey Gouly wrote:
> Add PKEY support to signals, by saving and restoring POR_EL0 from the stackframe.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Reviewed-by: Mark Brown <broonie@kernel.org>
> Acked-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 22/29] arm64: add Permission Overlay Extension Kconfig
  2024-05-03 13:01 ` [PATCH v4 22/29] arm64: add Permission Overlay Extension Kconfig Joey Gouly
@ 2024-07-05 17:05   ` Catalin Marinas
  2024-07-09 13:08   ` Kevin Brodsky
  2024-07-16 11:02   ` Anshuman Khandual
  2 siblings, 0 replies; 146+ messages in thread
From: Catalin Marinas @ 2024-07-05 17:05 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On Fri, May 03, 2024 at 02:01:40PM +0100, Joey Gouly wrote:
> Now that support for POE and Protection Keys has been implemented, add a
> config to allow users to actually enable it.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>

Acked-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 28/29] kselftest/arm64: Add test case for POR_EL0 signal frame records
  2024-05-29 15:51   ` Mark Brown
@ 2024-07-05 19:34     ` Shuah Khan
  0 siblings, 0 replies; 146+ messages in thread
From: Shuah Khan @ 2024-07-05 19:34 UTC (permalink / raw)
  To: Mark Brown, Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm, Shuah Khan

On 5/29/24 09:51, Mark Brown wrote:
> On Fri, May 03, 2024 at 02:01:46PM +0100, Joey Gouly wrote:
>> Ensure that we get signal context for POR_EL0 if and only if POE is present
>> on the system.
> 
> Reviewed-by: Mark Brown <broonie@kernel.org>

For kselftest:

Acked-by: Shuah Khan <skhan@linuxfoundation.org>

thanks,
-- Shuah

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 13/29] arm64: convert protection key into vm_flags and pgprot values
  2024-07-04 12:47       ` Joey Gouly
@ 2024-07-08 17:22         ` Catalin Marinas
  0 siblings, 0 replies; 146+ messages in thread
From: Catalin Marinas @ 2024-07-08 17:22 UTC (permalink / raw)
  To: Joey Gouly
  Cc: Amit Daniel Kachhap, linux-arm-kernel, akpm, aneesh.kumar,
	aneesh.kumar, bp, broonie, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Thu, Jul 04, 2024 at 01:47:04PM +0100, Joey Gouly wrote:
> On Wed, Jun 19, 2024 at 05:45:29PM +0100, Catalin Marinas wrote:
> > On Tue, May 28, 2024 at 12:24:57PM +0530, Amit Daniel Kachhap wrote:
> > > On 5/3/24 18:31, Joey Gouly wrote:
> > > > diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
> > > > index 5966ee4a6154..ecb2d18dc4d7 100644
> > > > --- a/arch/arm64/include/asm/mman.h
> > > > +++ b/arch/arm64/include/asm/mman.h
> > > > @@ -7,7 +7,7 @@
> > > >   #include <uapi/asm/mman.h>
> > > >   static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> > > > -	unsigned long pkey __always_unused)
> > > > +	unsigned long pkey)
> > > >   {
> > > >   	unsigned long ret = 0;
> > > > @@ -17,6 +17,12 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> > > >   	if (system_supports_mte() && (prot & PROT_MTE))
> > > >   		ret |= VM_MTE;
> > > > +#if defined(CONFIG_ARCH_HAS_PKEYS)
> > > 
> > > Should there be system_supports_poe() check like above?
> > 
> > I think it should, otherwise we end up with these bits in the pte even
> > when POE is not supported.
> 
> I think it can't get here due to the flow of the code, but I will add it to be
> defensive (since it's just an alternative that gets patched).

You are probably right, the mprotect_pkey() will reject the call if we
don't support POE. So you could add a comment instead (but a
system_supports_poe() check seems safer).

> I still need the defined(CONFIG_ARCH_HAS_PKEYS) check, since the VM_PKEY_BIT*
> are only defined then.

Yes, the ifdef will stay.

-- 
Catalin

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 17/29] arm64: implement PKEYS support
  2024-06-17 14:51           ` Szabolcs Nagy
@ 2024-07-08 17:53             ` Catalin Marinas
  2024-07-09  8:32               ` Szabolcs Nagy
  2024-07-11  9:50               ` Joey Gouly
  0 siblings, 2 replies; 146+ messages in thread
From: Catalin Marinas @ 2024-07-08 17:53 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: Florian Weimer, Joey Gouly, dave.hansen, linux-arm-kernel, akpm,
	aneesh.kumar, aneesh.kumar, bp, broonie, christophe.leroy, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, tglx, will, x86,
	kvmarm

Hi Szabolcs,

On Mon, Jun 17, 2024 at 03:51:35PM +0100, Szabolcs Nagy wrote:
> The 06/17/2024 15:40, Florian Weimer wrote:
> > >> A user can still set it by interacting with the register directly, but I guess
> > >> we want something for the glibc interface..
> > >> 
> > >> Dave, any thoughts here?
> > >
> > > adding Florian too, since i found an old thread of his that tried
> > > to add separate PKEY_DISABLE_READ and PKEY_DISABLE_EXECUTE, but
> > > it did not seem to end up upstream. (this makes more sense to me
> > > as libc api than the weird disable access semantics)
> > 
> > > I still think it makes sense to have a full complement of PKEY_* flags
> > complementing the PROT_* flags, in a somewhat abstract fashion for
> > pkey_alloc only.  The internal protection mask register encoding will
> > differ from architecture to architecture, but the abstract glibc
> > functions pkey_set and pkey_get could use them (if we are a bit
> > careful).
> 
> to me it makes sense to have abstract
> 
> PKEY_DISABLE_READ
> PKEY_DISABLE_WRITE
> PKEY_DISABLE_EXECUTE
> PKEY_DISABLE_ACCESS
> 
> where access is handled like
> 
> if (flags&PKEY_DISABLE_ACCESS)
> 	flags |= PKEY_DISABLE_READ|PKEY_DISABLE_WRITE;
> disable_read = flags&PKEY_DISABLE_READ;
> disable_write = flags&PKEY_DISABLE_WRITE;
> disable_exec = flags&PKEY_DISABLE_EXECUTE;
> 
> if there are unsupported combinations like
> disable_read&&!disable_write then those are rejected
> by pkey_alloc and pkey_set.
> 
> this allows portable use of pkey apis.
> (the flags could be target specific, but don't have to be)

On powerpc, PKEY_DISABLE_ACCESS also disables execution. AFAICT, the
kernel doesn't define a PKEY_DISABLE_READ, only PKEY_DISABLE_ACCESS so
for powerpc there's no way to to set an execute-only permission via this
interface. I wouldn't like to diverge from powerpc.

However, does it matter much? That's only for the initial setup, the
user can then change the permissions directly via the sysreg. So maybe
we don't need all those combinations upfront. A PKEY_DISABLE_EXECUTE
together with the full PKEY_DISABLE_ACCESS would probably suffice.

Given that on x86 the PKEY_ACCESS_MASK will have to stay as
PKEY_DISABLE_ACCESS|PKEY_DISABLE_WRITE, we'll probably do the same as
powerpc and define an arm64 specific PKEY_DISABLE_EXECUTE with the
corresponding PKEY_ACCESS_MASK including it. We can generalise the masks
with some ARCH_HAS_PKEY_DISABLE_EXECUTE but it's probably more hassle
than just defining the arm64 PKEY_DISABLE_EXECUTE.

I assume you'd like PKEY_DISABLE_EXECUTE to be part of this series,
otherwise changing PKEY_ACCESS_MASK later will cause potential ABI
issues.

-- 
Catalin

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 17/29] arm64: implement PKEYS support
  2024-07-08 17:53             ` Catalin Marinas
@ 2024-07-09  8:32               ` Szabolcs Nagy
  2024-07-09  8:52                 ` Florian Weimer
  2024-07-11  9:50               ` Joey Gouly
  1 sibling, 1 reply; 146+ messages in thread
From: Szabolcs Nagy @ 2024-07-09  8:32 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Florian Weimer, Joey Gouly, dave.hansen, linux-arm-kernel, akpm,
	aneesh.kumar, aneesh.kumar, bp, broonie, christophe.leroy, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, tglx, will, x86,
	kvmarm

The 07/08/2024 18:53, Catalin Marinas wrote:
> Hi Szabolcs,
> 
> On Mon, Jun 17, 2024 at 03:51:35PM +0100, Szabolcs Nagy wrote:
> > The 06/17/2024 15:40, Florian Weimer wrote:
> > > >> A user can still set it by interacting with the register directly, but I guess
> > > >> we want something for the glibc interface..
> > > >> 
> > > >> Dave, any thoughts here?
> > > >
> > > > adding Florian too, since i found an old thread of his that tried
> > > > to add separate PKEY_DISABLE_READ and PKEY_DISABLE_EXECUTE, but
> > > > it did not seem to end up upstream. (this makes more sense to me
> > > > as libc api than the weird disable access semantics)
> > > 
> > > I still think it makes sense to have a full complement of PKEY_* flags
> > > complementing the PROT_* flags, in a somewhat abstract fashion for
> > > pkey_alloc only.  The internal protection mask register encoding will
> > > differ from architecture to architecture, but the abstract glibc
> > > functions pkey_set and pkey_get could use them (if we are a bit
> > > careful).
> > 
> > to me it makes sense to have abstract
> > 
> > PKEY_DISABLE_READ
> > PKEY_DISABLE_WRITE
> > PKEY_DISABLE_EXECUTE
> > PKEY_DISABLE_ACCESS
> > 
> > where access is handled like
> > 
> > if (flags&PKEY_DISABLE_ACCESS)
> > 	flags |= PKEY_DISABLE_READ|PKEY_DISABLE_WRITE;
> > disable_read = flags&PKEY_DISABLE_READ;
> > disable_write = flags&PKEY_DISABLE_WRITE;
> > disable_exec = flags&PKEY_DISABLE_EXECUTE;
> > 
> > if there are unsupported combinations like
> > disable_read&&!disable_write then those are rejected
> > by pkey_alloc and pkey_set.
> > 
> > this allows portable use of pkey apis.
> > (the flags could be target specific, but don't have to be)
> 
> On powerpc, PKEY_DISABLE_ACCESS also disables execution. AFAICT, the
> kernel doesn't define a PKEY_DISABLE_READ, only PKEY_DISABLE_ACCESS so
> for powerpc there's no way to to set an execute-only permission via this
> interface. I wouldn't like to diverge from powerpc.

the exec permission should be documented in the man page.
and i think it should be consistent across targets
to allow portable use.

now ppc and x86 are inconsistent, i think it's not
ideal, but ok to say that targets without disable-exec
support do whatever x86 does with PKEY_DISABLE_ACCESS
otherwise it means whatever ppc does.

> 
> However, does it matter much? That's only for the initial setup, the
> user can then change the permissions directly via the sysreg. So maybe
> we don't need all those combinations upfront. A PKEY_DISABLE_EXECUTE
> together with the full PKEY_DISABLE_ACCESS would probably suffice.

this is ok.

a bit awkward in userspace when the register is directly
set to e.g write-only and pkey_get has to return something,
but we can handle settings outside of valid PKEY_* macros
as unspec, users who want that would use their own register
set/get code.

i would have designed the permission to use either existing
PROT_* flags or say that it is architectural and written to
the register directly and let the libc wrapper deal with
portable api, i guess it's too late now.

(the signal handling behaviour should have a control and it
is possible to fix e.g. via pkey_alloc flags, but that may
not be the best solution and this can be done later.)

> 
> Given that on x86 the PKEY_ACCESS_MASK will have to stay as
> PKEY_DISABLE_ACCESS|PKEY_DISABLE_WRITE, we'll probably do the same as
> powerpc and define an arm64 specific PKEY_DISABLE_EXECUTE with the
> corresponding PKEY_ACCESS_MASK including it. We can generalise the masks
> with some ARCH_HAS_PKEY_DISABLE_EXECUTE but it's probably more hassle
> than just defining the arm64 PKEY_DISABLE_EXECUTE.
> 
> I assume you'd like PKEY_DISABLE_EXECUTE to be part of this series,
> otherwise changing PKEY_ACCESS_MASK later will cause potential ABI
> issues.

yes i think we should figure this out in the initial support.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 17/29] arm64: implement PKEYS support
  2024-07-09  8:32               ` Szabolcs Nagy
@ 2024-07-09  8:52                 ` Florian Weimer
  0 siblings, 0 replies; 146+ messages in thread
From: Florian Weimer @ 2024-07-09  8:52 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: Catalin Marinas, Joey Gouly, dave.hansen, linux-arm-kernel, akpm,
	aneesh.kumar, aneesh.kumar, bp, broonie, christophe.leroy, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, tglx, will, x86,
	kvmarm

* Szabolcs Nagy:

>> However, does it matter much? That's only for the initial setup, the
>> user can then change the permissions directly via the sysreg. So maybe
>> we don't need all those combinations upfront. A PKEY_DISABLE_EXECUTE
>> together with the full PKEY_DISABLE_ACCESS would probably suffice.
>
> this is ok.
>
> a bit awkward in userspace when the register is directly
> set to e.g write-only and pkey_get has to return something,
> but we can handle settings outside of valid PKEY_* macros
> as unspec, users who want that would use their own register
> set/get code.
>
> i would have designed the permission to use either existing
> PROT_* flags or say that it is architectural and written to
> the register directly and let the libc wrapper deal with
> portable api, i guess it's too late now.

We can still define a portable API if we get a few more PKEY_* bits.
The last attempt stalled because the kernel does not really need them,
it would be for userspace benefit only.

For performance-critical code, pkey_get/pkey_set are already too slow,
so adding a bit more bit twiddling to it wouldn't be a problem, I think.
Applications that want to change protection key bits around a very short
code sequence will have to write the architecture-specific register.

> (the signal handling behaviour should have a control and it
> is possible to fix e.g. via pkey_alloc flags, but that may
> not be the best solution and this can be done later.)

For glibc, the POWER behavior is much more useful.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 15/29] arm64: handle PKEY/POE faults
  2024-05-03 13:01 ` [PATCH v4 15/29] arm64: handle PKEY/POE faults Joey Gouly
  2024-06-21 16:57   ` Catalin Marinas
@ 2024-07-09 13:03   ` Kevin Brodsky
  2024-07-16 10:13   ` Anshuman Khandual
  2024-07-25 15:57   ` Dave Martin
  3 siblings, 0 replies; 146+ messages in thread
From: Kevin Brodsky @ 2024-07-09 13:03 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On 03/05/2024 15:01, Joey Gouly wrote:
> [...]
>
> +static bool fault_from_pkey(unsigned long esr, struct vm_area_struct *vma,
> +			unsigned int mm_flags)
> +{
> +	unsigned long iss2 = ESR_ELx_ISS2(esr);
> +
> +	if (!arch_pkeys_enabled())
> +		return false;
> +
> +	if (iss2 & ESR_ELx_Overlay)
> +		return true;
> +
> +	return !arch_vma_access_permitted(vma,
> +			mm_flags & FAULT_FLAG_WRITE,
> +			mm_flags & FAULT_FLAG_INSTRUCTION,
> +			mm_flags & FAULT_FLAG_REMOTE);

This function is only called from do_page_fault(), so the access cannot
be remote. The equivalent x86 function (access_error()) always sets
foreign to false.

> +}
> +
>  static vm_fault_t __do_page_fault(struct mm_struct *mm,
>  				  struct vm_area_struct *vma, unsigned long addr,
>  				  unsigned int mm_flags, unsigned long vm_flags,
> @@ -529,6 +547,8 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>  	unsigned int mm_flags = FAULT_FLAG_DEFAULT;
>  	unsigned long addr = untagged_addr(far);
>  	struct vm_area_struct *vma;
> +	bool pkey_fault = false;
> +	int pkey = -1;
>  
>  	if (kprobe_page_fault(regs, esr))
>  		return 0;
> @@ -590,6 +610,12 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>  		vma_end_read(vma);
>  		goto lock_mmap;
>  	}
> +
> +	if (fault_from_pkey(esr, vma, mm_flags)) {
> +		vma_end_read(vma);
> +		goto lock_mmap;
> +	}
> +
>  	fault = handle_mm_fault(vma, addr, mm_flags | FAULT_FLAG_VMA_LOCK, regs);
>  	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
>  		vma_end_read(vma);
> @@ -617,6 +643,11 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>  		goto done;
>  	}
>  
> +	if (fault_from_pkey(esr, vma, mm_flags)) {
> +		pkey_fault = true;
> +		pkey = vma_pkey(vma);
> +	}
> +
>  	fault = __do_page_fault(mm, vma, addr, mm_flags, vm_flags, regs);

We don't actually need to call __do_page_fault()/handle_mm_fault() if
the fault was caused by POE. It still works since it checks
arch_vma_access_permitted() early, but we might as well skip it
altogether (like on x86). On 6.10-rcX, we could handle it like a missing
vm_flags (goto bad_area).

Kevin


* Re: [PATCH v4 17/29] arm64: implement PKEYS support
  2024-05-03 13:01 ` [PATCH v4 17/29] arm64: implement PKEYS support Joey Gouly
                     ` (2 preceding siblings ...)
  2024-07-05 16:59   ` Catalin Marinas
@ 2024-07-09 13:07   ` Kevin Brodsky
  2024-07-16 11:40     ` Anshuman Khandual
  2024-07-23  4:22   ` Anshuman Khandual
  2024-07-25 16:12   ` Dave Martin
  5 siblings, 1 reply; 146+ messages in thread
From: Kevin Brodsky @ 2024-07-09 13:07 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On 03/05/2024 15:01, Joey Gouly wrote:
> @@ -267,6 +294,28 @@ static inline unsigned long mm_untag_mask(struct mm_struct *mm)
>  	return -1UL >> 8;
>  }
>  
> +/*
> + * We only want to enforce protection keys on the current process
> + * because we effectively have no access to POR_EL0 for other
> + * processes or any way to tell *which* POR_EL0 in a threaded
> + * process we could use.

I see that this comment is essentially copied from x86, but to me it
misses the main point. Even with only one thread in the target process
and a way to obtain its POR_EL0, it still wouldn't make sense to check
that value. If we take the case of a debugger accessing an inferior via
ptrace(), for instance, the kernel is asked to access some memory in
another mm. However, the debugger's POR_EL0 is tied to its own address
space, and the target's POR_EL0 is relevant to its own execution flow
only. In such situations, there is essentially no user context for the
access, so it fundamentally does not make sense to make checks based on
pkey/POE or similar restrictions to memory accesses (e.g. MTE).

Kevin

> + *
> + * So do not enforce things if the VMA is not from the current
> + * mm, or if we are in a kernel thread.
> + */
> +static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
> +		bool write, bool execute, bool foreign)
> +{
> +	if (!arch_pkeys_enabled())
> +		return true;
> +
> +	/* allow access if the VMA is not one from this process */
> +	if (foreign || vma_is_foreign(vma))
> +		return true;
> +
> +	return por_el0_allows_pkey(vma_pkey(vma), write, execute);
> +}
> +



* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-05-03 13:01 ` [PATCH v4 18/29] arm64: add POE signal support Joey Gouly
  2024-05-28  6:56   ` Amit Daniel Kachhap
  2024-07-05 17:04   ` Catalin Marinas
@ 2024-07-09 13:08   ` Kevin Brodsky
  2024-07-22  9:16   ` Anshuman Khandual
  2024-07-25 16:00   ` Dave Martin
  4 siblings, 0 replies; 146+ messages in thread
From: Kevin Brodsky @ 2024-07-09 13:08 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On 03/05/2024 15:01, Joey Gouly wrote:
> @@ -1020,6 +1060,15 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user,
>  		__put_user_error(current->thread.fault_code, &esr_ctx->esr, err);
>  	}
>  
> +	if (system_supports_poe() && err == 0 && user->poe_offset) {
> +		struct poe_context __user *poe_ctx =
> +			apply_user_offset(user, user->poe_offset);
> +
> +		__put_user_error(POE_MAGIC, &poe_ctx->head.magic, err);
> +		__put_user_error(sizeof(*poe_ctx), &poe_ctx->head.size, err);
> +		__put_user_error(read_sysreg_s(SYS_POR_EL0), &poe_ctx->por_el0, err);

Nit: would be nicer to have this in its own helper
(preserve_poe_context()), like for the other optional records.

Kevin


* Re: [PATCH v4 22/29] arm64: add Permission Overlay Extension Kconfig
  2024-05-03 13:01 ` [PATCH v4 22/29] arm64: add Permission Overlay Extension Kconfig Joey Gouly
  2024-07-05 17:05   ` Catalin Marinas
@ 2024-07-09 13:08   ` Kevin Brodsky
  2024-07-16 11:02   ` Anshuman Khandual
  2 siblings, 0 replies; 146+ messages in thread
From: Kevin Brodsky @ 2024-07-09 13:08 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On 03/05/2024 15:01, Joey Gouly wrote:
> Now that support for POE and Protection Keys has been implemented, add a
> config to allow users to actually enable it.
>
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/Kconfig | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 7b11c98b3e84..676ebe4bf9eb 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -2095,6 +2095,28 @@ config ARM64_EPAN
>  	  if the cpu does not implement the feature.
>  endmenu # "ARMv8.7 architectural features"
>  
> +menu "ARMv8.9 architectural features"

Nit: empty line here to be consistent with other menu entries.

Kevin

> +config ARM64_POE
> +	prompt "Permission Overlay Extension"
> +	def_bool y
> +	select ARCH_USES_HIGH_VMA_FLAGS
> +	select ARCH_HAS_PKEYS
> +	help
> +	  The Permission Overlay Extension is used to implement Memory
> +	  Protection Keys. Memory Protection Keys provide a mechanism for
> +	  enforcing page-based protections, but without requiring modification
> +	  of the page tables when an application changes protection domains.
> +
> +	  For details, see Documentation/core-api/protection-keys.rst
> +
> +	  If unsure, say y.
> +
> +config ARCH_PKEY_BITS
> +	int
> +	default 3
> +
> +endmenu # "ARMv8.9 architectural features"
> +
>  config ARM64_SVE
>  	bool "ARM Scalable Vector Extension support"
>  	default y



* Re: [PATCH v4 28/29] kselftest/arm64: Add test case for POR_EL0 signal frame records
  2024-05-03 13:01 ` [PATCH v4 28/29] kselftest/arm64: Add test case for POR_EL0 signal frame records Joey Gouly
  2024-05-29 15:51   ` Mark Brown
@ 2024-07-09 13:10   ` Kevin Brodsky
  1 sibling, 0 replies; 146+ messages in thread
From: Kevin Brodsky @ 2024-07-09 13:10 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On 03/05/2024 15:01, Joey Gouly wrote:
> +static uint64_t get_por_el0(void)
> +{
> +	uint64_t val;
> +
> +	asm volatile (
> +		"mrs	%0, " SYS_POR_EL0 "\n"
> +		: "=r"(val)
> +		:
> +		: "cc");

Not sure why we would need "cc" for an MRS? __read_pkey_reg() doesn't
use it (maybe we could directly use that function here if including
pkey-arm64.h is OK).

Kevin

> +
> +	return val;
> +}


* Re: [PATCH v4 17/29] arm64: implement PKEYS support
  2024-07-08 17:53             ` Catalin Marinas
  2024-07-09  8:32               ` Szabolcs Nagy
@ 2024-07-11  9:50               ` Joey Gouly
  2024-07-18 14:45                 ` Szabolcs Nagy
  1 sibling, 1 reply; 146+ messages in thread
From: Joey Gouly @ 2024-07-11  9:50 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Szabolcs Nagy, Florian Weimer, dave.hansen, linux-arm-kernel,
	akpm, aneesh.kumar, aneesh.kumar, bp, broonie, christophe.leroy,
	hpa, linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, tglx, will, x86,
	kvmarm

On Mon, Jul 08, 2024 at 06:53:18PM +0100, Catalin Marinas wrote:
> Hi Szabolcs,
> 
> On Mon, Jun 17, 2024 at 03:51:35PM +0100, Szabolcs Nagy wrote:
> > The 06/17/2024 15:40, Florian Weimer wrote:
> > > >> A user can still set it by interacting with the register directly, but I guess
> > > >> we want something for the glibc interface..
> > > >> 
> > > >> Dave, any thoughts here?
> > > >
> > > > adding Florian too, since i found an old thread of his that tried
> > > > to add separate PKEY_DISABLE_READ and PKEY_DISABLE_EXECUTE, but
> > > > it did not seem to end up upstream. (this makes more sense to me
> > > > as libc api than the weird disable access semantics)
> > > 
> > > I still think it makes sense to have a full complement of PKEY_* flags
> > > complementing the PROT_* flags, in a somewhat abstract fashion for
> > > pkey_alloc only.  The internal protection mask register encoding will
> > > differ from architecture to architecture, but the abstract glibc
> > > functions pkey_set and pkey_get could use them (if we are a bit
> > > careful).
> > 
> > to me it makes sense to have abstract
> > 
> > PKEY_DISABLE_READ
> > PKEY_DISABLE_WRITE
> > PKEY_DISABLE_EXECUTE
> > PKEY_DISABLE_ACCESS
> > 
> > where access is handled like
> > 
> > if (flags&PKEY_DISABLE_ACCESS)
> > 	flags |= PKEY_DISABLE_READ|PKEY_DISABLE_WRITE;
> > disable_read = flags&PKEY_DISABLE_READ;
> > disable_write = flags&PKEY_DISABLE_WRITE;
> > disable_exec = flags&PKEY_DISABLE_EXECUTE;
> > 
> > if there are unsupported combinations like
> > disable_read&&!disable_write then those are rejected
> > by pkey_alloc and pkey_set.
> > 
> > this allows portable use of pkey apis.
> > (the flags could be target specific, but don't have to be)
> 
> On powerpc, PKEY_DISABLE_ACCESS also disables execution. AFAICT, the
> kernel doesn't define a PKEY_DISABLE_READ, only PKEY_DISABLE_ACCESS so
> for powerpc there's no way to to set an execute-only permission via this
> interface. I wouldn't like to diverge from powerpc.

I think this is wrong, look at this code from powerpc:

arch/powerpc/mm/book3s64/pkeys.c: __arch_set_user_pkey_access

        if (init_val & PKEY_DISABLE_EXECUTE) {
                if (!pkey_execute_disable_supported)
                        return -EINVAL;
                new_iamr_bits |= IAMR_EX_BIT;
        }
        init_iamr(pkey, new_iamr_bits);

        /* Set the bits we need in AMR: */
        if (init_val & PKEY_DISABLE_ACCESS)
                new_amr_bits |= AMR_RD_BIT | AMR_WR_BIT;
        else if (init_val & PKEY_DISABLE_WRITE)
                new_amr_bits |= AMR_WR_BIT;

        init_amr(pkey, new_amr_bits);

Seems to me that PKEY_DISABLE_ACCESS leaves exec permissions as-is.

Here is the patch I am planning to include in the next version of the series.
This should support all PKEY_DISABLE_* combinations. Any comments? 

commit ba51371a544f6b0a4a0f03df62ad894d53f5039b
Author: Joey Gouly <joey.gouly@arm.com>
Date:   Thu Jul 4 11:29:20 2024 +0100

    arm64: add PKEY_DISABLE_READ and PKEY_DISABLE_EXEC
    
    TODO
    
    Signed-off-by: Joey Gouly <joey.gouly@arm.com>

diff --git arch/arm64/include/uapi/asm/mman.h arch/arm64/include/uapi/asm/mman.h
index 1e6482a838e1..e7e0c8216243 100644
--- arch/arm64/include/uapi/asm/mman.h
+++ arch/arm64/include/uapi/asm/mman.h
@@ -7,4 +7,13 @@
 #define PROT_BTI       0x10            /* BTI guarded page */
 #define PROT_MTE       0x20            /* Normal Tagged mapping */
 
+/* Override any generic PKEY permission defines */
+#define PKEY_DISABLE_EXECUTE   0x4
+#define PKEY_DISABLE_READ      0x8
+#undef PKEY_ACCESS_MASK
+#define PKEY_ACCESS_MASK       (PKEY_DISABLE_ACCESS |\
+                               PKEY_DISABLE_WRITE  |\
+                               PKEY_DISABLE_READ   |\
+                               PKEY_DISABLE_EXECUTE)
+
 #endif /* ! _UAPI__ASM_MMAN_H */
diff --git arch/arm64/mm/mmu.c arch/arm64/mm/mmu.c
index 68afe5fc3071..ce4cc6bdee4e 100644
--- arch/arm64/mm/mmu.c
+++ arch/arm64/mm/mmu.c
@@ -1570,10 +1570,15 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey, unsigned long i
                return -EINVAL;
 
        /* Set the bits we need in POR:  */
+       new_por = POE_RXW;
+       if (init_val & PKEY_DISABLE_WRITE)
+               new_por &= ~POE_W;
        if (init_val & PKEY_DISABLE_ACCESS)
-               new_por = POE_X;
-       else if (init_val & PKEY_DISABLE_WRITE)
-               new_por = POE_RX;
+               new_por &= ~POE_RW;
+       if (init_val & PKEY_DISABLE_READ)
+               new_por &= ~POE_R;
+       if (init_val & PKEY_DISABLE_EXECUTE)
+               new_por &= ~POE_X;
 
        /* Shift the bits into the correct place in POR for pkey: */
        pkey_shift = pkey * POR_BITS_PER_PKEY;



Thanks,
Joey


* Re: [PATCH v4 04/29] arm64: disable trapping of POR_EL0 to EL2
  2024-05-03 13:01 ` [PATCH v4 04/29] arm64: disable trapping of POR_EL0 to EL2 Joey Gouly
@ 2024-07-15  7:47   ` Anshuman Khandual
  2024-07-25 15:44   ` Dave Martin
  1 sibling, 0 replies; 146+ messages in thread
From: Anshuman Khandual @ 2024-07-15  7:47 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm



On 5/3/24 18:31, Joey Gouly wrote:
> Allow EL0 or EL1 to access POR_EL0 without being trapped to EL2.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
>  arch/arm64/include/asm/el2_setup.h | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
> index b7afaa026842..df5614be4b70 100644
> --- a/arch/arm64/include/asm/el2_setup.h
> +++ b/arch/arm64/include/asm/el2_setup.h
> @@ -184,12 +184,20 @@
>  .Lset_pie_fgt_\@:
>  	mrs_s	x1, SYS_ID_AA64MMFR3_EL1
>  	ubfx	x1, x1, #ID_AA64MMFR3_EL1_S1PIE_SHIFT, #4
> -	cbz	x1, .Lset_fgt_\@
> +	cbz	x1, .Lset_poe_fgt_\@
>  
>  	/* Disable trapping of PIR_EL1 / PIRE0_EL1 */
>  	orr	x0, x0, #HFGxTR_EL2_nPIR_EL1
>  	orr	x0, x0, #HFGxTR_EL2_nPIRE0_EL1
>  
> +.Lset_poe_fgt_\@:
> +	mrs_s	x1, SYS_ID_AA64MMFR3_EL1
> +	ubfx	x1, x1, #ID_AA64MMFR3_EL1_S1POE_SHIFT, #4
> +	cbz	x1, .Lset_fgt_\@
> +
> +	/* Disable trapping of POR_EL0 */
> +	orr	x0, x0, #HFGxTR_EL2_nPOR_EL0
> +
>  .Lset_fgt_\@:
>  	msr_s	SYS_HFGRTR_EL2, x0
>  	msr_s	SYS_HFGWTR_EL2, x0

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>


* Re: [PATCH v4 05/29] arm64: cpufeature: add Permission Overlay Extension cpucap
  2024-05-03 13:01 ` [PATCH v4 05/29] arm64: cpufeature: add Permission Overlay Extension cpucap Joey Gouly
  2024-06-21 16:58   ` Catalin Marinas
  2024-06-21 17:01   ` Catalin Marinas
@ 2024-07-15  7:47   ` Anshuman Khandual
  2 siblings, 0 replies; 146+ messages in thread
From: Anshuman Khandual @ 2024-07-15  7:47 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm



On 5/3/24 18:31, Joey Gouly wrote:
> This indicates if the system supports POE. This is a CPUCAP_BOOT_CPU_FEATURE
> as the boot CPU will enable POE if it has it, so secondary CPUs must also
> have this feature.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/kernel/cpufeature.c | 9 +++++++++
>  arch/arm64/tools/cpucaps       | 1 +
>  2 files changed, 10 insertions(+)
> 
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 56583677c1f2..2f3c2346e156 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -2861,6 +2861,15 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>  		.matches = has_nv1,
>  		ARM64_CPUID_FIELDS_NEG(ID_AA64MMFR4_EL1, E2H0, NI_NV1)
>  	},
> +#ifdef CONFIG_ARM64_POE
> +	{
> +		.desc = "Stage-1 Permission Overlay Extension (S1POE)",
> +		.capability = ARM64_HAS_S1POE,
> +		.type = ARM64_CPUCAP_BOOT_CPU_FEATURE,
> +		.matches = has_cpuid_feature,
> +		ARM64_CPUID_FIELDS(ID_AA64MMFR3_EL1, S1POE, IMP)
> +	},
> +#endif
>  	{},
>  };
>  
> diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
> index 62b2838a231a..45f558fc0d87 100644
> --- a/arch/arm64/tools/cpucaps
> +++ b/arch/arm64/tools/cpucaps
> @@ -45,6 +45,7 @@ HAS_MOPS
>  HAS_NESTED_VIRT
>  HAS_PAN
>  HAS_S1PIE
> +HAS_S1POE
>  HAS_RAS_EXTN
>  HAS_RNG
>  HAS_SB

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>


* Re: [PATCH v4 03/29] mm: use ARCH_PKEY_BITS to define VM_PKEY_BITN
  2024-05-03 13:01 ` [PATCH v4 03/29] mm: use ARCH_PKEY_BITS to define VM_PKEY_BITN Joey Gouly
  2024-05-03 16:41   ` Dave Hansen
@ 2024-07-15  7:53   ` Anshuman Khandual
  1 sibling, 0 replies; 146+ messages in thread
From: Anshuman Khandual @ 2024-07-15  7:53 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm



On 5/3/24 18:31, Joey Gouly wrote:
> Use the new CONFIG_ARCH_PKEY_BITS to simplify setting these bits
> for different architectures.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: linux-fsdevel@vger.kernel.org
> Cc: linux-mm@kvack.org
> ---
>  fs/proc/task_mmu.c |  2 ++
>  include/linux/mm.h | 16 ++++++++++------
>  2 files changed, 12 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 23fbab954c20..0d152f460dcc 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -692,7 +692,9 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
>  		[ilog2(VM_PKEY_BIT0)]	= "",
>  		[ilog2(VM_PKEY_BIT1)]	= "",
>  		[ilog2(VM_PKEY_BIT2)]	= "",
> +#if VM_PKEY_BIT3
>  		[ilog2(VM_PKEY_BIT3)]	= "",
> +#endif
>  #if VM_PKEY_BIT4
>  		[ilog2(VM_PKEY_BIT4)]	= "",
>  #endif
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index b6bdaa18b9e9..5605b938acce 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -329,12 +329,16 @@ extern unsigned int kobjsize(const void *objp);
>  #endif /* CONFIG_ARCH_USES_HIGH_VMA_FLAGS */
>  
>  #ifdef CONFIG_ARCH_HAS_PKEYS
> -# define VM_PKEY_SHIFT	VM_HIGH_ARCH_BIT_0
> -# define VM_PKEY_BIT0	VM_HIGH_ARCH_0	/* A protection key is a 4-bit value */
> -# define VM_PKEY_BIT1	VM_HIGH_ARCH_1	/* on x86 and 5-bit value on ppc64   */
> -# define VM_PKEY_BIT2	VM_HIGH_ARCH_2
> -# define VM_PKEY_BIT3	VM_HIGH_ARCH_3
> -#ifdef CONFIG_PPC
> +# define VM_PKEY_SHIFT VM_HIGH_ARCH_BIT_0
> +# define VM_PKEY_BIT0  VM_HIGH_ARCH_0
> +# define VM_PKEY_BIT1  VM_HIGH_ARCH_1
> +# define VM_PKEY_BIT2  VM_HIGH_ARCH_2
> +#if CONFIG_ARCH_PKEY_BITS > 3
> +# define VM_PKEY_BIT3  VM_HIGH_ARCH_3
> +#else
> +# define VM_PKEY_BIT3  0
> +#endif
> +#if CONFIG_ARCH_PKEY_BITS > 4
>  # define VM_PKEY_BIT4  VM_HIGH_ARCH_4
>  #else
>  # define VM_PKEY_BIT4  0

Agree with Dave that this is not very clean, but it does the job, i.e. getting
rid of the platform #ifdef, which in itself is an improvement.

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>


* Re: [PATCH v4 06/29] arm64: context switch POR_EL0 register
  2024-05-03 13:01 ` [PATCH v4 06/29] arm64: context switch POR_EL0 register Joey Gouly
  2024-06-21 17:03   ` Catalin Marinas
  2024-06-21 17:07   ` Catalin Marinas
@ 2024-07-15  8:27   ` Anshuman Khandual
  2024-07-16 13:21     ` Mark Brown
  2024-07-18 14:16     ` Joey Gouly
  2024-07-22 13:40   ` Kevin Brodsky
  2024-07-25 15:46   ` Dave Martin
  4 siblings, 2 replies; 146+ messages in thread
From: Anshuman Khandual @ 2024-07-15  8:27 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm



On 5/3/24 18:31, Joey Gouly wrote:
> POR_EL0 is a register that can be modified by userspace directly,
> so it must be context switched.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/cpufeature.h |  6 ++++++
>  arch/arm64/include/asm/processor.h  |  1 +
>  arch/arm64/include/asm/sysreg.h     |  3 +++
>  arch/arm64/kernel/process.c         | 28 ++++++++++++++++++++++++++++
>  4 files changed, 38 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index 8b904a757bd3..d46aab23e06e 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -832,6 +832,12 @@ static inline bool system_supports_lpa2(void)
>  	return cpus_have_final_cap(ARM64_HAS_LPA2);
>  }
>  
> +static inline bool system_supports_poe(void)
> +{
> +	return IS_ENABLED(CONFIG_ARM64_POE) &&

CONFIG_ARM64_POE has not been defined/added until now?

> +		alternative_has_cap_unlikely(ARM64_HAS_S1POE);
> +}
> +
>  int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
>  bool try_emulate_mrs(struct pt_regs *regs, u32 isn);
>  
> diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> index f77371232d8c..e6376f979273 100644
> --- a/arch/arm64/include/asm/processor.h
> +++ b/arch/arm64/include/asm/processor.h
> @@ -184,6 +184,7 @@ struct thread_struct {
>  	u64			sctlr_user;
>  	u64			svcr;
>  	u64			tpidr2_el0;
> +	u64			por_el0;
>  };

As there is going to be a new config, i.e. CONFIG_ARM64_POE, should not this
register be wrapped in #ifdef CONFIG_ARM64_POE as well? Similarly, accesses
to p->thread.por_el0 should also be conditional on that config.

>  
>  static inline unsigned int thread_get_vl(struct thread_struct *thread,
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index 9e8999592f3a..62c399811dbf 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -1064,6 +1064,9 @@
>  #define POE_RXW		UL(0x7)
>  #define POE_MASK	UL(0xf)
>  
> +/* Initial value for Permission Overlay Extension for EL0 */
> +#define POR_EL0_INIT	POE_RXW

The idea behind POE_RXW as the init value is to be all permissive?

> +
>  #define ARM64_FEATURE_FIELD_BITS	4
>  
>  /* Defined for compatibility only, do not add new users. */
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 4ae31b7af6c3..0ffaca98bed6 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -271,12 +271,23 @@ static void flush_tagged_addr_state(void)
>  		clear_thread_flag(TIF_TAGGED_ADDR);
>  }
>  
> +static void flush_poe(void)
> +{
> +	if (!system_supports_poe())
> +		return;
> +
> +	write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0);
> +	/* ISB required for kernel uaccess routines when changing POR_EL0 */
> +	isb();
> +}
> +
>  void flush_thread(void)
>  {
>  	fpsimd_flush_thread();
>  	tls_thread_flush();
>  	flush_ptrace_hw_breakpoint(current);
>  	flush_tagged_addr_state();
> +	flush_poe();
>  }
>  
>  void arch_release_task_struct(struct task_struct *tsk)
> @@ -371,6 +382,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
>  		if (system_supports_tpidr2())
>  			p->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
>  
> +		if (system_supports_poe())
> +			p->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
> +
>  		if (stack_start) {
>  			if (is_compat_thread(task_thread_info(p)))
>  				childregs->compat_sp = stack_start;
> @@ -495,6 +509,19 @@ static void erratum_1418040_new_exec(void)
>  	preempt_enable();
>  }
>  
> +static void permission_overlay_switch(struct task_struct *next)
> +{
> +	if (!system_supports_poe())
> +		return;
> +
> +	current->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
> +	if (current->thread.por_el0 != next->thread.por_el0) {
> +		write_sysreg_s(next->thread.por_el0, SYS_POR_EL0);
> +		/* ISB required for kernel uaccess routines when changing POR_EL0 */
> +		isb();
> +	}
> +}
> +
>  /*
>   * __switch_to() checks current->thread.sctlr_user as an optimisation. Therefore
>   * this function must be called with preemption disabled and the update to
> @@ -530,6 +557,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
>  	ssbs_thread_switch(next);
>  	erratum_1418040_thread_switch(next);
>  	ptrauth_thread_switch_user(next);
> +	permission_overlay_switch(next);
>  
>  	/*
>  	 * Complete any pending TLB or cache maintenance on this CPU in case


* Re: [PATCH v4 08/29] KVM: arm64: make kvm_at() take an OP_AT_*
  2024-05-03 13:01 ` [PATCH v4 08/29] KVM: arm64: make kvm_at() take an OP_AT_* Joey Gouly
  2024-05-29 15:46   ` Marc Zyngier
@ 2024-07-15  8:36   ` Anshuman Khandual
  1 sibling, 0 replies; 146+ messages in thread
From: Anshuman Khandual @ 2024-07-15  8:36 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm



On 5/3/24 18:31, Joey Gouly wrote:
> To allow using newer instructions that current assemblers don't know about,
> replace the `at` instruction with the underlying SYS instruction.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Oliver Upton <oliver.upton@linux.dev>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/kvm_asm.h       | 3 ++-
>  arch/arm64/kvm/hyp/include/hyp/fault.h | 2 +-
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> index 24b5e6b23417..ce65fd0f01b0 100644
> --- a/arch/arm64/include/asm/kvm_asm.h
> +++ b/arch/arm64/include/asm/kvm_asm.h
> @@ -10,6 +10,7 @@
>  #include <asm/hyp_image.h>
>  #include <asm/insn.h>
>  #include <asm/virt.h>
> +#include <asm/sysreg.h>
>  
>  #define ARM_EXIT_WITH_SERROR_BIT  31
>  #define ARM_EXCEPTION_CODE(x)	  ((x) & ~(1U << ARM_EXIT_WITH_SERROR_BIT))
> @@ -261,7 +262,7 @@ extern u64 __kvm_get_mdcr_el2(void);
>  	asm volatile(							\
>  	"	mrs	%1, spsr_el2\n"					\
>  	"	mrs	%2, elr_el2\n"					\
> -	"1:	at	"at_op", %3\n"					\
> +	"1:	" __msr_s(at_op, "%3") "\n"				\
>  	"	isb\n"							\
>  	"	b	9f\n"						\
>  	"2:	msr	spsr_el2, %1\n"					\
> diff --git a/arch/arm64/kvm/hyp/include/hyp/fault.h b/arch/arm64/kvm/hyp/include/hyp/fault.h
> index 9e13c1bc2ad5..487c06099d6f 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/fault.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/fault.h
> @@ -27,7 +27,7 @@ static inline bool __translate_far_to_hpfar(u64 far, u64 *hpfar)
>  	 * saved the guest context yet, and we may return early...
>  	 */
>  	par = read_sysreg_par();
> -	if (!__kvm_at("s1e1r", far))
> +	if (!__kvm_at(OP_AT_S1E1R, far))
>  		tmp = read_sysreg_par();
>  	else
>  		tmp = SYS_PAR_EL1_F; /* back to the guest */

I guess this patch has already been included in a different series now.

https://lore.kernel.org/all/20240625133508.259829-6-maz@kernel.org/


* Re: [PATCH v4 09/29] KVM: arm64: use `at s1e1a` for POE
  2024-05-03 13:01 ` [PATCH v4 09/29] KVM: arm64: use `at s1e1a` for POE Joey Gouly
  2024-05-29 15:50   ` Marc Zyngier
@ 2024-07-15  8:45   ` Anshuman Khandual
  1 sibling, 0 replies; 146+ messages in thread
From: Anshuman Khandual @ 2024-07-15  8:45 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm



On 5/3/24 18:31, Joey Gouly wrote:
> FEAT_ATS1E1A introduces a new instruction: `at s1e1a`.
> This is an address translation, without permission checks.
> 
> POE allows read permissions to be removed from S1 by the guest.  This means
> that an `at` instruction could fail, and not get the IPA.
> 
> Switch to using `at s1e1a` so that KVM can get the IPA regardless of S1
> permissions.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Oliver Upton <oliver.upton@linux.dev>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/kvm/hyp/include/hyp/fault.h | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kvm/hyp/include/hyp/fault.h b/arch/arm64/kvm/hyp/include/hyp/fault.h
> index 487c06099d6f..17df94570f03 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/fault.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/fault.h
> @@ -14,6 +14,7 @@
>  
>  static inline bool __translate_far_to_hpfar(u64 far, u64 *hpfar)
>  {
> +	int ret;
>  	u64 par, tmp;
>  
>  	/*
> @@ -27,7 +28,9 @@ static inline bool __translate_far_to_hpfar(u64 far, u64 *hpfar)
>  	 * saved the guest context yet, and we may return early...
>  	 */
>  	par = read_sysreg_par();
> -	if (!__kvm_at(OP_AT_S1E1R, far))
> +	ret = system_supports_poe() ? __kvm_at(OP_AT_S1E1A, far) :
> +	                              __kvm_at(OP_AT_S1E1R, far);
> +	if (!ret)
>  		tmp = read_sysreg_par();
>  	else
>  		tmp = SYS_PAR_EL1_F; /* back to the guest */

Since the idea is to get the IPA, using OP_AT_S1E1A instead makes sense
when POE is enabled.

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>


* Re: [PATCH v4 10/29] arm64: enable the Permission Overlay Extension for EL0
  2024-05-03 13:01 ` [PATCH v4 10/29] arm64: enable the Permission Overlay Extension for EL0 Joey Gouly
  2024-06-21 17:04   ` Catalin Marinas
@ 2024-07-15  9:13   ` Anshuman Khandual
  2024-07-15 20:16   ` Mark Brown
  2024-07-25 15:49   ` Dave Martin
  3 siblings, 0 replies; 146+ messages in thread
From: Anshuman Khandual @ 2024-07-15  9:13 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm



On 5/3/24 18:31, Joey Gouly wrote:
> Expose a HWCAP and ID_AA64MMFR3_EL1_S1POE to userspace, so they can be used to
> check if the CPU supports the feature.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
> 
> This takes the last bit of HWCAP2, is this fine? What can we do about more features in the future?
> 
> 
>  Documentation/arch/arm64/elf_hwcaps.rst |  2 ++
>  arch/arm64/include/asm/hwcap.h          |  1 +
>  arch/arm64/include/uapi/asm/hwcap.h     |  1 +
>  arch/arm64/kernel/cpufeature.c          | 14 ++++++++++++++
>  arch/arm64/kernel/cpuinfo.c             |  1 +
>  5 files changed, 19 insertions(+)
> 
> diff --git a/Documentation/arch/arm64/elf_hwcaps.rst b/Documentation/arch/arm64/elf_hwcaps.rst
> index 448c1664879b..694f67fa07d1 100644
> --- a/Documentation/arch/arm64/elf_hwcaps.rst
> +++ b/Documentation/arch/arm64/elf_hwcaps.rst
> @@ -365,6 +365,8 @@ HWCAP2_SME_SF8DP2
>  HWCAP2_SME_SF8DP4
>      Functionality implied by ID_AA64SMFR0_EL1.SF8DP4 == 0b1.
>  
> +HWCAP2_POE
> +    Functionality implied by ID_AA64MMFR3_EL1.S1POE == 0b0001.
>  
>  4. Unused AT_HWCAP bits
>  -----------------------
> diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
> index 4edd3b61df11..a775adddecf2 100644
> --- a/arch/arm64/include/asm/hwcap.h
> +++ b/arch/arm64/include/asm/hwcap.h
> @@ -157,6 +157,7 @@
>  #define KERNEL_HWCAP_SME_SF8FMA		__khwcap2_feature(SME_SF8FMA)
>  #define KERNEL_HWCAP_SME_SF8DP4		__khwcap2_feature(SME_SF8DP4)
>  #define KERNEL_HWCAP_SME_SF8DP2		__khwcap2_feature(SME_SF8DP2)
> +#define KERNEL_HWCAP_POE		__khwcap2_feature(POE)
>  
>  /*
>   * This yields a mask that user programs can use to figure out what
> diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h
> index 285610e626f5..055381b2c615 100644
> --- a/arch/arm64/include/uapi/asm/hwcap.h
> +++ b/arch/arm64/include/uapi/asm/hwcap.h
> @@ -122,5 +122,6 @@
>  #define HWCAP2_SME_SF8FMA	(1UL << 60)
>  #define HWCAP2_SME_SF8DP4	(1UL << 61)
>  #define HWCAP2_SME_SF8DP2	(1UL << 62)
> +#define HWCAP2_POE		(1UL << 63)
>  
>  #endif /* _UAPI__ASM_HWCAP_H */
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 2f3c2346e156..8c02aae9db11 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -465,6 +465,8 @@ static const struct arm64_ftr_bits ftr_id_aa64mmfr2[] = {
>  };
>  
>  static const struct arm64_ftr_bits ftr_id_aa64mmfr3[] = {
> +	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_POE),
> +		       FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR3_EL1_S1POE_SHIFT, 4, 0),
>  	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR3_EL1_S1PIE_SHIFT, 4, 0),
>  	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR3_EL1_TCRX_SHIFT, 4, 0),
>  	ARM64_FTR_END,
> @@ -2339,6 +2341,14 @@ static void cpu_enable_mops(const struct arm64_cpu_capabilities *__unused)
>  	sysreg_clear_set(sctlr_el1, 0, SCTLR_EL1_MSCEn);
>  }
>  
> +#ifdef CONFIG_ARM64_POE
> +static void cpu_enable_poe(const struct arm64_cpu_capabilities *__unused)
> +{
> +	sysreg_clear_set(REG_TCR2_EL1, 0, TCR2_EL1x_E0POE);
> +	sysreg_clear_set(CPACR_EL1, 0, CPACR_ELx_E0POE);
> +}
> +#endif
> +
>  /* Internal helper functions to match cpu capability type */
>  static bool
>  cpucap_late_cpu_optional(const struct arm64_cpu_capabilities *cap)
> @@ -2867,6 +2877,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>  		.capability = ARM64_HAS_S1POE,
>  		.type = ARM64_CPUCAP_BOOT_CPU_FEATURE,
>  		.matches = has_cpuid_feature,
> +		.cpu_enable = cpu_enable_poe,
>  		ARM64_CPUID_FIELDS(ID_AA64MMFR3_EL1, S1POE, IMP)
>  	},
>  #endif
> @@ -3034,6 +3045,9 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
>  	HWCAP_CAP(ID_AA64FPFR0_EL1, F8DP2, IMP, CAP_HWCAP, KERNEL_HWCAP_F8DP2),
>  	HWCAP_CAP(ID_AA64FPFR0_EL1, F8E4M3, IMP, CAP_HWCAP, KERNEL_HWCAP_F8E4M3),
>  	HWCAP_CAP(ID_AA64FPFR0_EL1, F8E5M2, IMP, CAP_HWCAP, KERNEL_HWCAP_F8E5M2),
> +#ifdef CONFIG_ARM64_POE
> +	HWCAP_CAP(ID_AA64MMFR3_EL1, S1POE, IMP, CAP_HWCAP, KERNEL_HWCAP_POE),
> +#endif
>  	{},
>  };
>  
> diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
> index 09eeaa24d456..b9db812082b3 100644
> --- a/arch/arm64/kernel/cpuinfo.c
> +++ b/arch/arm64/kernel/cpuinfo.c
> @@ -143,6 +143,7 @@ static const char *const hwcap_str[] = {
>  	[KERNEL_HWCAP_SME_SF8FMA]	= "smesf8fma",
>  	[KERNEL_HWCAP_SME_SF8DP4]	= "smesf8dp4",
>  	[KERNEL_HWCAP_SME_SF8DP2]	= "smesf8dp2",
> +	[KERNEL_HWCAP_POE]		= "poe",
>  };
>  
>  #ifdef CONFIG_COMPAT

This LGTM, but as Joey mentioned earlier, what happens when another new
feature that needs to be exposed to userspace gets added later? Do we
add HWCAP3?

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>


* Re: [PATCH v4 11/29] arm64: re-order MTE VM_ flags
  2024-05-03 13:01 ` [PATCH v4 11/29] arm64: re-order MTE VM_ flags Joey Gouly
  2024-06-21 17:04   ` Catalin Marinas
@ 2024-07-15  9:21   ` Anshuman Khandual
  1 sibling, 0 replies; 146+ messages in thread
From: Anshuman Khandual @ 2024-07-15  9:21 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm


On 5/3/24 18:31, Joey Gouly wrote:
> To make it easier to share the generic PKEYs flags, move the MTE flag.

The change looks good, but there is too little detail about it here. Please
consider adding some more description of how moving the VM flags down the
arch range helps going forward.

> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
>  include/linux/mm.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 5605b938acce..2065727b3787 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -377,8 +377,8 @@ extern unsigned int kobjsize(const void *objp);
>  #endif
>  
>  #if defined(CONFIG_ARM64_MTE)
> -# define VM_MTE		VM_HIGH_ARCH_0	/* Use Tagged memory for access control */
> -# define VM_MTE_ALLOWED	VM_HIGH_ARCH_1	/* Tagged memory permitted */
> +# define VM_MTE		VM_HIGH_ARCH_4	/* Use Tagged memory for access control */
> +# define VM_MTE_ALLOWED	VM_HIGH_ARCH_5	/* Tagged memory permitted */
>  #else
>  # define VM_MTE		VM_NONE
>  # define VM_MTE_ALLOWED	VM_NONE


* Re: [PATCH v4 12/29] arm64: add POIndex defines
  2024-05-03 13:01 ` [PATCH v4 12/29] arm64: add POIndex defines Joey Gouly
  2024-06-21 17:05   ` Catalin Marinas
@ 2024-07-15  9:26   ` Anshuman Khandual
  1 sibling, 0 replies; 146+ messages in thread
From: Anshuman Khandual @ 2024-07-15  9:26 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On 5/3/24 18:31, Joey Gouly wrote:
> The 3-bit POIndex is stored in the PTE at bits 60..62.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/pgtable-hwdef.h | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
> index ef207a0d4f0d..370a02922fe1 100644
> --- a/arch/arm64/include/asm/pgtable-hwdef.h
> +++ b/arch/arm64/include/asm/pgtable-hwdef.h
> @@ -198,6 +198,16 @@
>  #define PTE_PI_IDX_2	53	/* PXN */
>  #define PTE_PI_IDX_3	54	/* UXN */
>  
> +/*
> + * POIndex[2:0] encoding (Permission Overlay Extension)
> + */
> +#define PTE_PO_IDX_0	(_AT(pteval_t, 1) << 60)
> +#define PTE_PO_IDX_1	(_AT(pteval_t, 1) << 61)
> +#define PTE_PO_IDX_2	(_AT(pteval_t, 1) << 62)
> +
> +#define PTE_PO_IDX_MASK		GENMASK_ULL(62, 60)
> +
> +
>  /*
>   * Memory Attribute override for Stage-2 (MemAttr[3:0])
>   */

Could this patch be folded into a later patch that uses the above indices
and the mask for the first time?


* Re: [PATCH v4 10/29] arm64: enable the Permission Overlay Extension for EL0
  2024-05-03 13:01 ` [PATCH v4 10/29] arm64: enable the Permission Overlay Extension for EL0 Joey Gouly
  2024-06-21 17:04   ` Catalin Marinas
  2024-07-15  9:13   ` Anshuman Khandual
@ 2024-07-15 20:16   ` Mark Brown
  2024-07-25 15:49   ` Dave Martin
  3 siblings, 0 replies; 146+ messages in thread
From: Mark Brown @ 2024-07-15 20:16 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Fri, May 03, 2024 at 02:01:28PM +0100, Joey Gouly wrote:

> This takes the last bit of HWCAP2, is this fine? What can we do about
> more features in the future?

HWCAP3 has already been allocated so we could just start using that.


* Re: [PATCH v4 13/29] arm64: convert protection key into vm_flags and pgprot values
  2024-05-03 13:01 ` [PATCH v4 13/29] arm64: convert protection key into vm_flags and pgprot values Joey Gouly
  2024-05-28  6:54   ` Amit Daniel Kachhap
@ 2024-07-16  9:05   ` Anshuman Khandual
  2024-07-16  9:34     ` Joey Gouly
  2024-07-25 15:49   ` Dave Martin
  2 siblings, 1 reply; 146+ messages in thread
From: Anshuman Khandual @ 2024-07-16  9:05 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm



On 5/3/24 18:31, Joey Gouly wrote:
> Modify arch_calc_vm_prot_bits() and vm_get_page_prot() such that the pkey
> value is set in the vm_flags and then into the pgprot value.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/mman.h | 8 +++++++-
>  arch/arm64/mm/mmap.c          | 9 +++++++++
>  2 files changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
> index 5966ee4a6154..ecb2d18dc4d7 100644
> --- a/arch/arm64/include/asm/mman.h
> +++ b/arch/arm64/include/asm/mman.h
> @@ -7,7 +7,7 @@
>  #include <uapi/asm/mman.h>
>  
>  static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> -	unsigned long pkey __always_unused)
> +	unsigned long pkey)
>  {
>  	unsigned long ret = 0;
>  
> @@ -17,6 +17,12 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
>  	if (system_supports_mte() && (prot & PROT_MTE))
>  		ret |= VM_MTE;
>  
> +#if defined(CONFIG_ARCH_HAS_PKEYS)
> +	ret |= pkey & 0x1 ? VM_PKEY_BIT0 : 0;
> +	ret |= pkey & 0x2 ? VM_PKEY_BIT1 : 0;
> +	ret |= pkey & 0x4 ? VM_PKEY_BIT2 : 0;

Are 0x1, 0x2, 0x4 here the standard bit positions for the corresponding
VM_PKEY_* based protection values? Although this is similar to what x86
is doing currently, I am just trying to understand whether these bit
positions are part of the user-visible ABI, which should be standardized.

I agree with the previous comments about needing an additional
system_supports_poe() check for the above code block.

> +#endif
> +
>  	return ret;
>  }
>  #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
> diff --git a/arch/arm64/mm/mmap.c b/arch/arm64/mm/mmap.c
> index 642bdf908b22..86eda6bc7893 100644
> --- a/arch/arm64/mm/mmap.c
> +++ b/arch/arm64/mm/mmap.c
> @@ -102,6 +102,15 @@ pgprot_t vm_get_page_prot(unsigned long vm_flags)
>  	if (vm_flags & VM_MTE)
>  		prot |= PTE_ATTRINDX(MT_NORMAL_TAGGED);
>  
> +#ifdef CONFIG_ARCH_HAS_PKEYS
> +	if (vm_flags & VM_PKEY_BIT0)
> +		prot |= PTE_PO_IDX_0;
> +	if (vm_flags & VM_PKEY_BIT1)
> +		prot |= PTE_PO_IDX_1;
> +	if (vm_flags & VM_PKEY_BIT2)
> +		prot |= PTE_PO_IDX_2;
> +#endif
> +
>  	return __pgprot(prot);
>  }
>  EXPORT_SYMBOL(vm_get_page_prot);


* Re: [PATCH v4 14/29] arm64: mask out POIndex when modifying a PTE
  2024-05-03 13:01 ` [PATCH v4 14/29] arm64: mask out POIndex when modifying a PTE Joey Gouly
@ 2024-07-16  9:10   ` Anshuman Khandual
  0 siblings, 0 replies; 146+ messages in thread
From: Anshuman Khandual @ 2024-07-16  9:10 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm



On 5/3/24 18:31, Joey Gouly wrote:
> When a PTE is modified, the POIndex must be masked off so that it can be modified.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

> ---
>  arch/arm64/include/asm/pgtable.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index afdd56d26ad7..5c970a9cca67 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -1028,7 +1028,8 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
>  	 */
>  	const pteval_t mask = PTE_USER | PTE_PXN | PTE_UXN | PTE_RDONLY |
>  			      PTE_PROT_NONE | PTE_VALID | PTE_WRITE | PTE_GP |
> -			      PTE_ATTRINDX_MASK;
> +			      PTE_ATTRINDX_MASK | PTE_PO_IDX_MASK;
> +
>  	/* preserve the hardware dirty information */
>  	if (pte_hw_dirty(pte))
>  		pte = set_pte_bit(pte, __pgprot(PTE_DIRTY));


* Re: [PATCH v4 13/29] arm64: convert protection key into vm_flags and pgprot values
  2024-07-16  9:05   ` Anshuman Khandual
@ 2024-07-16  9:34     ` Joey Gouly
  0 siblings, 0 replies; 146+ messages in thread
From: Joey Gouly @ 2024-07-16  9:34 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Tue, Jul 16, 2024 at 02:35:48PM +0530, Anshuman Khandual wrote:
> 
> 
> On 5/3/24 18:31, Joey Gouly wrote:
> > Modify arch_calc_vm_prot_bits() and vm_get_page_prot() such that the pkey
> > value is set in the vm_flags and then into the pgprot value.
> > 
> > Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > ---
> >  arch/arm64/include/asm/mman.h | 8 +++++++-
> >  arch/arm64/mm/mmap.c          | 9 +++++++++
> >  2 files changed, 16 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
> > index 5966ee4a6154..ecb2d18dc4d7 100644
> > --- a/arch/arm64/include/asm/mman.h
> > +++ b/arch/arm64/include/asm/mman.h
> > @@ -7,7 +7,7 @@
> >  #include <uapi/asm/mman.h>
> >  
> >  static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> > -	unsigned long pkey __always_unused)
> > +	unsigned long pkey)
> >  {
> >  	unsigned long ret = 0;
> >  
> > @@ -17,6 +17,12 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> >  	if (system_supports_mte() && (prot & PROT_MTE))
> >  		ret |= VM_MTE;
> >  
> > +#if defined(CONFIG_ARCH_HAS_PKEYS)
> > +	ret |= pkey & 0x1 ? VM_PKEY_BIT0 : 0;
> > +	ret |= pkey & 0x2 ? VM_PKEY_BIT1 : 0;
> > +	ret |= pkey & 0x4 ? VM_PKEY_BIT2 : 0;
> 
> 0x1, 0x2, 0x4 here are standard bit positions for their corresponding
> VM_KEY_XXX based protection values ? Although this is similar to what
> x86 is doing currently, hence just trying to understand if these bit
> positions here are related to the user visible ABI, which should be
> standardized ?

The bit positions of VM_PKEY_BIT* aren't user visible. This is converting
the value of the `pkey` that was passed to mprotect() into the internal
flags.

I might replace those hex values with BIT(0), BIT(1), BIT(2); that might
be clearer.

> 
> Agree with previous comments about the need for system_supports_poe()
> based additional check for the above code block.
> 
> > +#endif
> > +
> >  	return ret;
> >  }
> >  #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
> > diff --git a/arch/arm64/mm/mmap.c b/arch/arm64/mm/mmap.c
> > index 642bdf908b22..86eda6bc7893 100644
> > --- a/arch/arm64/mm/mmap.c
> > +++ b/arch/arm64/mm/mmap.c
> > @@ -102,6 +102,15 @@ pgprot_t vm_get_page_prot(unsigned long vm_flags)
> >  	if (vm_flags & VM_MTE)
> >  		prot |= PTE_ATTRINDX(MT_NORMAL_TAGGED);
> >  
> > +#ifdef CONFIG_ARCH_HAS_PKEYS
> > +	if (vm_flags & VM_PKEY_BIT0)
> > +		prot |= PTE_PO_IDX_0;
> > +	if (vm_flags & VM_PKEY_BIT1)
> > +		prot |= PTE_PO_IDX_1;
> > +	if (vm_flags & VM_PKEY_BIT2)
> > +		prot |= PTE_PO_IDX_2;
> > +#endif
> > +
> >  	return __pgprot(prot);
> >  }
> >  EXPORT_SYMBOL(vm_get_page_prot);
> 

Thanks,
Joey


* Re: [PATCH v4 15/29] arm64: handle PKEY/POE faults
  2024-05-03 13:01 ` [PATCH v4 15/29] arm64: handle PKEY/POE faults Joey Gouly
  2024-06-21 16:57   ` Catalin Marinas
  2024-07-09 13:03   ` Kevin Brodsky
@ 2024-07-16 10:13   ` Anshuman Khandual
  2024-07-25 15:57   ` Dave Martin
  3 siblings, 0 replies; 146+ messages in thread
From: Anshuman Khandual @ 2024-07-16 10:13 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

A minor nit: in HW terms the fault is a POE fault rather than a PKEY one;
PKEY is the abstraction used in core MM. Hence it might be better to
describe the fault as a POE one rather than a PKEY-related one, e.g.:

arm64/mm: Handle POE faults

On 5/3/24 18:31, Joey Gouly wrote:
> If a memory fault occurs that is due to an overlay/pkey fault, report that to
> userspace with a SEGV_PKUERR.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/traps.h |  1 +
>  arch/arm64/kernel/traps.c      | 12 ++++++--
>  arch/arm64/mm/fault.c          | 56 ++++++++++++++++++++++++++++++++--
>  3 files changed, 64 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/traps.h b/arch/arm64/include/asm/traps.h
> index eefe766d6161..f6f6f2cb7f10 100644
> --- a/arch/arm64/include/asm/traps.h
> +++ b/arch/arm64/include/asm/traps.h
> @@ -25,6 +25,7 @@ try_emulate_armv8_deprecated(struct pt_regs *regs, u32 insn)
>  void force_signal_inject(int signal, int code, unsigned long address, unsigned long err);
>  void arm64_notify_segfault(unsigned long addr);
>  void arm64_force_sig_fault(int signo, int code, unsigned long far, const char *str);
> +void arm64_force_sig_fault_pkey(int signo, int code, unsigned long far, const char *str, int pkey);
>  void arm64_force_sig_mceerr(int code, unsigned long far, short lsb, const char *str);
>  void arm64_force_sig_ptrace_errno_trap(int errno, unsigned long far, const char *str);
>  
> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> index 215e6d7f2df8..1bac6c84d3f5 100644
> --- a/arch/arm64/kernel/traps.c
> +++ b/arch/arm64/kernel/traps.c
> @@ -263,16 +263,24 @@ static void arm64_show_signal(int signo, const char *str)
>  	__show_regs(regs);
>  }
>  
> -void arm64_force_sig_fault(int signo, int code, unsigned long far,
> -			   const char *str)
> +void arm64_force_sig_fault_pkey(int signo, int code, unsigned long far,
> +			   const char *str, int pkey)
>  {
>  	arm64_show_signal(signo, str);
>  	if (signo == SIGKILL)
>  		force_sig(SIGKILL);
> +	else if (code == SEGV_PKUERR)
> +		force_sig_pkuerr((void __user *)far, pkey);
>  	else
>  		force_sig_fault(signo, code, (void __user *)far);
>  }
>  
> +void arm64_force_sig_fault(int signo, int code, unsigned long far,
> +			   const char *str)
> +{
> +	arm64_force_sig_fault_pkey(signo, code, far, str, 0);
> +}
> +

Could arm64_force_sig_fault_pkey() not be added as a new standalone
helper, without refactoring arm64_force_sig_fault()? Is there any
benefit?

>  void arm64_force_sig_mceerr(int code, unsigned long far, short lsb,
>  			    const char *str)
>  {
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 8251e2fea9c7..585295168918 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -23,6 +23,7 @@
>  #include <linux/sched/debug.h>
>  #include <linux/highmem.h>
>  #include <linux/perf_event.h>
> +#include <linux/pkeys.h>
>  #include <linux/preempt.h>
>  #include <linux/hugetlb.h>
>  
> @@ -489,6 +490,23 @@ static void do_bad_area(unsigned long far, unsigned long esr,
>  #define VM_FAULT_BADMAP		((__force vm_fault_t)0x010000)
>  #define VM_FAULT_BADACCESS	((__force vm_fault_t)0x020000)
>  
> +static bool fault_from_pkey(unsigned long esr, struct vm_area_struct *vma,
> +			unsigned int mm_flags)
> +{
> +	unsigned long iss2 = ESR_ELx_ISS2(esr);
> +
> +	if (!arch_pkeys_enabled())
> +		return false;
> +
> +	if (iss2 & ESR_ELx_Overlay)
> +		return true;

Shouldn't there be a spurious POE fault check, with a WARN_ONCE() splat,
when ESR_ELx_Overlay is set without arch_pkeys_enabled()?

> +
> +	return !arch_vma_access_permitted(vma,
> +			mm_flags & FAULT_FLAG_WRITE,
> +			mm_flags & FAULT_FLAG_INSTRUCTION,
> +			mm_flags & FAULT_FLAG_REMOTE);
> +}

Is FAULT_FLAG_REMOTE applicable here?

> +
>  static vm_fault_t __do_page_fault(struct mm_struct *mm,
>  				  struct vm_area_struct *vma, unsigned long addr,
>  				  unsigned int mm_flags, unsigned long vm_flags,
> @@ -529,6 +547,8 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>  	unsigned int mm_flags = FAULT_FLAG_DEFAULT;
>  	unsigned long addr = untagged_addr(far);
>  	struct vm_area_struct *vma;
> +	bool pkey_fault = false;
> +	int pkey = -1;
>  
>  	if (kprobe_page_fault(regs, esr))
>  		return 0;
> @@ -590,6 +610,12 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>  		vma_end_read(vma);
>  		goto lock_mmap;
>  	}
> +
> +	if (fault_from_pkey(esr, vma, mm_flags)) {
> +		vma_end_read(vma);
> +		goto lock_mmap;
> +	}
> +
>  	fault = handle_mm_fault(vma, addr, mm_flags | FAULT_FLAG_VMA_LOCK, regs);
>  	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
>  		vma_end_read(vma);
> @@ -617,6 +643,11 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>  		goto done;
>  	}
>  
> +	if (fault_from_pkey(esr, vma, mm_flags)) {
> +		pkey_fault = true;
> +		pkey = vma_pkey(vma);
> +	}
> +
>  	fault = __do_page_fault(mm, vma, addr, mm_flags, vm_flags, regs);
>  
>  	/* Quick path to respond to signals */
> @@ -682,9 +713,28 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>  		 * Something tried to access memory that isn't in our memory
>  		 * map.
>  		 */
> -		arm64_force_sig_fault(SIGSEGV,
> -				      fault == VM_FAULT_BADACCESS ? SEGV_ACCERR : SEGV_MAPERR,
> -				      far, inf->name);
> +		int fault_kind;
> +		/*
> +		 * The pkey value that we return to userspace can be different
> +		 * from the pkey that caused the fault.
> +		 *
> +		 * 1. T1   : mprotect_key(foo, PAGE_SIZE, pkey=4);
> +		 * 2. T1   : set POR_EL0 to deny access to pkey=4, touches, page
> +		 * 3. T1   : faults...
> +		 * 4.    T2: mprotect_key(foo, PAGE_SIZE, pkey=5);
> +		 * 5. T1   : enters fault handler, takes mmap_lock, etc...
> +		 * 6. T1   : reaches here, sees vma_pkey(vma)=5, when we really
> +		 *	     faulted on a pte with its pkey=4.
> +		 */
> +
> +		if (pkey_fault)
> +			fault_kind = SEGV_PKUERR;
> +		else
> +			fault_kind = fault == VM_FAULT_BADACCESS ? SEGV_ACCERR : SEGV_MAPERR;
> +
> +		arm64_force_sig_fault_pkey(SIGSEGV,
> +				      fault_kind,
> +				      far, inf->name, pkey);
>  	}
>  
>  	return 0;


* Re: [PATCH v4 16/29] arm64: add pte_access_permitted_no_overlay()
  2024-05-03 13:01 ` [PATCH v4 16/29] arm64: add pte_access_permitted_no_overlay() Joey Gouly
  2024-06-21 17:15   ` Catalin Marinas
@ 2024-07-16 10:21   ` Anshuman Khandual
  1 sibling, 0 replies; 146+ messages in thread
From: Anshuman Khandual @ 2024-07-16 10:21 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm



On 5/3/24 18:31, Joey Gouly wrote:
> We do not want take POE into account when clearing the MTE tags.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

> ---
>  arch/arm64/include/asm/pgtable.h | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 5c970a9cca67..2449e4e27ea6 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -160,8 +160,10 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
>   * not set) must return false. PROT_NONE mappings do not have the
>   * PTE_VALID bit set.
>   */
> -#define pte_access_permitted(pte, write) \
> +#define pte_access_permitted_no_overlay(pte, write) \
>  	(((pte_val(pte) & (PTE_VALID | PTE_USER)) == (PTE_VALID | PTE_USER)) && (!(write) || pte_write(pte)))
> +#define pte_access_permitted(pte, write) \
> +	pte_access_permitted_no_overlay(pte, write)
>  #define pmd_access_permitted(pmd, write) \
>  	(pte_access_permitted(pmd_pte(pmd), (write)))
>  #define pud_access_permitted(pud, write) \
> @@ -348,10 +350,11 @@ static inline void __sync_cache_and_tags(pte_t pte, unsigned int nr_pages)
>  	/*
>  	 * If the PTE would provide user space access to the tags associated
>  	 * with it then ensure that the MTE tags are synchronised.  Although
> -	 * pte_access_permitted() returns false for exec only mappings, they
> -	 * don't expose tags (instruction fetches don't check tags).
> +	 * pte_access_permitted_no_overlay() returns false for exec only
> +	 * mappings, they don't expose tags (instruction fetches don't check
> +	 * tags).
>  	 */
> -	if (system_supports_mte() && pte_access_permitted(pte, false) &&
> +	if (system_supports_mte() && pte_access_permitted_no_overlay(pte, false) &&
>  	    !pte_special(pte) && pte_tagged(pte))
>  		mte_sync_tags(pte, nr_pages);
>  }


* Re: [PATCH v4 21/29] arm64/ptrace: add support for FEAT_POE
  2024-05-03 13:01 ` [PATCH v4 21/29] arm64/ptrace: add support for FEAT_POE Joey Gouly
@ 2024-07-16 10:35   ` Anshuman Khandual
  0 siblings, 0 replies; 146+ messages in thread
From: Anshuman Khandual @ 2024-07-16 10:35 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm



On 5/3/24 18:31, Joey Gouly wrote:
> Add a regset for POE containing POR_EL0.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Reviewed-by: Mark Brown <broonie@kernel.org>
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

> ---
>  arch/arm64/kernel/ptrace.c | 46 ++++++++++++++++++++++++++++++++++++++
>  include/uapi/linux/elf.h   |  1 +
>  2 files changed, 47 insertions(+)
> 
> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
> index 0d022599eb61..b756578aeaee 100644
> --- a/arch/arm64/kernel/ptrace.c
> +++ b/arch/arm64/kernel/ptrace.c
> @@ -1440,6 +1440,39 @@ static int tagged_addr_ctrl_set(struct task_struct *target, const struct
>  }
>  #endif
>  
> +#ifdef CONFIG_ARM64_POE
> +static int poe_get(struct task_struct *target,
> +		   const struct user_regset *regset,
> +		   struct membuf to)
> +{
> +	if (!system_supports_poe())
> +		return -EINVAL;
> +
> +	return membuf_write(&to, &target->thread.por_el0,
> +			    sizeof(target->thread.por_el0));
> +}
> +
> +static int poe_set(struct task_struct *target, const struct
> +		   user_regset *regset, unsigned int pos,
> +		   unsigned int count, const void *kbuf, const
> +		   void __user *ubuf)
> +{
> +	int ret;
> +	long ctrl;
> +
> +	if (!system_supports_poe())
> +		return -EINVAL;
> +
> +	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &ctrl, 0, -1);
> +	if (ret)
> +		return ret;
> +
> +	target->thread.por_el0 = ctrl;
> +
> +	return 0;
> +}
> +#endif
> +
>  enum aarch64_regset {
>  	REGSET_GPR,
>  	REGSET_FPR,
> @@ -1469,6 +1502,9 @@ enum aarch64_regset {
>  #ifdef CONFIG_ARM64_TAGGED_ADDR_ABI
>  	REGSET_TAGGED_ADDR_CTRL,
>  #endif
> +#ifdef CONFIG_ARM64_POE
> +	REGSET_POE
> +#endif
>  };
>  
>  static const struct user_regset aarch64_regsets[] = {
> @@ -1628,6 +1664,16 @@ static const struct user_regset aarch64_regsets[] = {
>  		.set = tagged_addr_ctrl_set,
>  	},
>  #endif
> +#ifdef CONFIG_ARM64_POE
> +	[REGSET_POE] = {
> +		.core_note_type = NT_ARM_POE,
> +		.n = 1,
> +		.size = sizeof(long),
> +		.align = sizeof(long),
> +		.regset_get = poe_get,
> +		.set = poe_set,
> +	},
> +#endif
>  };
>  
>  static const struct user_regset_view user_aarch64_view = {
> diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
> index b54b313bcf07..81762ff3c99e 100644
> --- a/include/uapi/linux/elf.h
> +++ b/include/uapi/linux/elf.h
> @@ -441,6 +441,7 @@ typedef struct elf64_shdr {
>  #define NT_ARM_ZA	0x40c		/* ARM SME ZA registers */
>  #define NT_ARM_ZT	0x40d		/* ARM SME ZT registers */
>  #define NT_ARM_FPMR	0x40e		/* ARM floating point mode register */
> +#define NT_ARM_POE	0x40f		/* ARM POE registers */
>  #define NT_ARC_V2	0x600		/* ARCv2 accumulator/extra registers */
>  #define NT_VMCOREDD	0x700		/* Vmcore Device Dump Note */
>  #define NT_MIPS_DSP	0x800		/* MIPS DSP ASE registers */


* Re: [PATCH v4 20/29] arm64: enable POE and PIE to coexist
  2024-05-03 13:01 ` [PATCH v4 20/29] arm64: enable POE and PIE to coexist Joey Gouly
  2024-06-21 17:16   ` Catalin Marinas
@ 2024-07-16 10:41   ` Anshuman Khandual
  2024-07-16 13:46     ` Joey Gouly
  1 sibling, 1 reply; 146+ messages in thread
From: Anshuman Khandual @ 2024-07-16 10:41 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm



On 5/3/24 18:31, Joey Gouly wrote:
> Set the EL0/userspace indirection encodings to be the overlay enabled
> variants of the permissions.

Could you please explain the rationale for this? Should the POE variants of
the pte permissions be used (when available) instead of the permission
indirection ones?

> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/pgtable-prot.h | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
> index dd9ee67d1d87..4f9f85437d3d 100644
> --- a/arch/arm64/include/asm/pgtable-prot.h
> +++ b/arch/arm64/include/asm/pgtable-prot.h
> @@ -147,10 +147,10 @@ static inline bool __pure lpa2_is_enabled(void)
>  
>  #define PIE_E0	( \
>  	PIRx_ELx_PERM(pte_pi_index(_PAGE_EXECONLY),      PIE_X_O) | \
> -	PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY_EXEC), PIE_RX)  | \
> -	PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED_EXEC),   PIE_RWX) | \
> -	PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY),      PIE_R)   | \
> -	PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED),        PIE_RW))
> +	PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY_EXEC), PIE_RX_O)  | \
> +	PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED_EXEC),   PIE_RWX_O) | \
> +	PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY),      PIE_R_O)   | \
> +	PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED),        PIE_RW_O))
>  
>  #define PIE_E1	( \
>  	PIRx_ELx_PERM(pte_pi_index(_PAGE_EXECONLY),      PIE_NONE_O) | \

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 19/29] arm64: enable PKEY support for CPUs with S1POE
  2024-05-03 13:01 ` [PATCH v4 19/29] arm64: enable PKEY support for CPUs with S1POE Joey Gouly
@ 2024-07-16 10:47   ` Anshuman Khandual
  2024-07-25 15:48     ` Dave Martin
  2024-07-25 16:00   ` Dave Martin
  1 sibling, 1 reply; 146+ messages in thread
From: Anshuman Khandual @ 2024-07-16 10:47 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm



On 5/3/24 18:31, Joey Gouly wrote:
> Now that PKEYs support has been implemented, enable it for CPUs that
> support S1POE.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Acked-by: Catalin Marinas <catalin.marinas@arm.com>

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

> ---
>  arch/arm64/include/asm/pkeys.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/pkeys.h b/arch/arm64/include/asm/pkeys.h
> index a284508a4d02..3ea928ec94c0 100644
> --- a/arch/arm64/include/asm/pkeys.h
> +++ b/arch/arm64/include/asm/pkeys.h
> @@ -17,7 +17,7 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
>  
>  static inline bool arch_pkeys_enabled(void)
>  {
> -	return false;
> +	return system_supports_poe();
>  }
>  
>  static inline int vma_pkey(struct vm_area_struct *vma)

Small nit. Would it be better to consistently use the system_supports_poe()
helper rather than arch_pkeys_enabled() inside arch/arm64/ platform code,
e.g. during POE fault handling, i.e. inside fault_from_pkey()?

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 22/29] arm64: add Permission Overlay Extension Kconfig
  2024-05-03 13:01 ` [PATCH v4 22/29] arm64: add Permission Overlay Extension Kconfig Joey Gouly
  2024-07-05 17:05   ` Catalin Marinas
  2024-07-09 13:08   ` Kevin Brodsky
@ 2024-07-16 11:02   ` Anshuman Khandual
  2 siblings, 0 replies; 146+ messages in thread
From: Anshuman Khandual @ 2024-07-16 11:02 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm



On 5/3/24 18:31, Joey Gouly wrote:
> Now that support for POE and Protection Keys has been implemented, add a
> config to allow users to actually enable it.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

> ---
>  arch/arm64/Kconfig | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 7b11c98b3e84..676ebe4bf9eb 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -2095,6 +2095,28 @@ config ARM64_EPAN
>  	  if the cpu does not implement the feature.
>  endmenu # "ARMv8.7 architectural features"
>  
> +menu "ARMv8.9 architectural features"

Agree with Kevin regarding the need for an empty line here.

> +config ARM64_POE
> +	prompt "Permission Overlay Extension"
> +	def_bool y
> +	select ARCH_USES_HIGH_VMA_FLAGS
> +	select ARCH_HAS_PKEYS
> +	help
> +	  The Permission Overlay Extension is used to implement Memory
> +	  Protection Keys. Memory Protection Keys provide a mechanism for
> +	  enforcing page-based protections, but without requiring modification
> +	  of the page tables when an application changes protection domains.
> +
> +	  For details, see Documentation/core-api/protection-keys.rst
> +
> +	  If unsure, say y.
> +
> +config ARCH_PKEY_BITS
> +	int
> +	default 3
> +
> +endmenu # "ARMv8.9 architectural features"
> +
>  config ARM64_SVE
>  	bool "ARM Scalable Vector Extension support"
>  	default y

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 17/29] arm64: implement PKEYS support
  2024-07-09 13:07   ` Kevin Brodsky
@ 2024-07-16 11:40     ` Anshuman Khandual
  0 siblings, 0 replies; 146+ messages in thread
From: Anshuman Khandual @ 2024-07-16 11:40 UTC (permalink / raw)
  To: Kevin Brodsky, Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm



On 7/9/24 18:37, Kevin Brodsky wrote:
> On 03/05/2024 15:01, Joey Gouly wrote:
>> @@ -267,6 +294,28 @@ static inline unsigned long mm_untag_mask(struct mm_struct *mm)
>>  	return -1UL >> 8;
>>  }
>>  
>> +/*
>> + * We only want to enforce protection keys on the current process
>> + * because we effectively have no access to POR_EL0 for other
>> + * processes or any way to tell *which * POR_EL0 in a threaded
>> + * process we could use.
> 
> I see that this comment is essentially copied from x86, but to me it
> misses the main point. Even with only one thread in the target process
> and a way to obtain its POR_EL0, it still wouldn't make sense to check
> that value. If we take the case of a debugger accessing an inferior via
> ptrace(), for instance, the kernel is asked to access some memory in
> another mm. However, the debugger's POR_EL0 is tied to its own address
> space, and the target's POR_EL0 is relevant to its own execution flow
> only. In such situations, there is essentially no user context for the
> access, so It fundamentally does not make sense to make checks based on
> pkey/POE or similar restrictions to memory accesses (e.g. MTE).

Indeed this makes more sense. There is no user memory context even if we had
access to the other thread's POR_EL0. The comment above could be improved to
describe this limitation.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 06/29] arm64: context switch POR_EL0 register
  2024-07-15  8:27   ` Anshuman Khandual
@ 2024-07-16 13:21     ` Mark Brown
  2024-07-18 14:16     ` Joey Gouly
  1 sibling, 0 replies; 146+ messages in thread
From: Mark Brown @ 2024-07-16 13:21 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Joey Gouly, linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar,
	bp, catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

[-- Attachment #1: Type: text/plain, Size: 533 bytes --]

On Mon, Jul 15, 2024 at 01:57:10PM +0530, Anshuman Khandual wrote:
> On 5/3/24 18:31, Joey Gouly wrote:

> > +static inline bool system_supports_poe(void)
> > +{
> > +	return IS_ENABLED(CONFIG_ARM64_POE) &&

> CONFIG_ARM64_POE has not been defined/added until now ?

That's a common pattern when adding a new feature over a multi-patch
series - add sections guarded with the Kconfig option for the new
feature but which can't be enabled until the last patch of the series
which adds the Kconfig option after the support is complete.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 20/29] arm64: enable POE and PIE to coexist
  2024-07-16 10:41   ` Anshuman Khandual
@ 2024-07-16 13:46     ` Joey Gouly
  0 siblings, 0 replies; 146+ messages in thread
From: Joey Gouly @ 2024-07-16 13:46 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Tue, Jul 16, 2024 at 04:11:54PM +0530, Anshuman Khandual wrote:
> 
> 
> On 5/3/24 18:31, Joey Gouly wrote:
> > Set the EL0/userspace indirection encodings to be the overlay enabled
> > variants of the permissions.
> 
> Could you please explain the rationale for this ? Should POE variants for
> pte permissions be used (when available) instead of permission indirection
> ones.

POE and PIE can be enabled independently. When PIE is disabled, the overlay is
applied on top of the permissions described in the PTE.
If PIE is enabled, the overlay is applied on top of the indirect permissions.
However, the indirect permissions can control whether the overlay actually
applies or not. This change therefore makes POE apply whether PIE is enabled
or not.

For example:
	Encoding of PIRx_ELx permissions:
	0001 	Read, Overlay applied
	...
	1000	Read, Overlay not applied. 


I will add something to the commit message.

> 
> > 
> > Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > ---
> >  arch/arm64/include/asm/pgtable-prot.h | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
> > index dd9ee67d1d87..4f9f85437d3d 100644
> > --- a/arch/arm64/include/asm/pgtable-prot.h
> > +++ b/arch/arm64/include/asm/pgtable-prot.h
> > @@ -147,10 +147,10 @@ static inline bool __pure lpa2_is_enabled(void)
> >  
> >  #define PIE_E0	( \
> >  	PIRx_ELx_PERM(pte_pi_index(_PAGE_EXECONLY),      PIE_X_O) | \
> > -	PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY_EXEC), PIE_RX)  | \
> > -	PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED_EXEC),   PIE_RWX) | \
> > -	PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY),      PIE_R)   | \
> > -	PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED),        PIE_RW))
> > +	PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY_EXEC), PIE_RX_O)  | \
> > +	PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED_EXEC),   PIE_RWX_O) | \
> > +	PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY),      PIE_R_O)   | \
> > +	PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED),        PIE_RW_O))
> >  
> >  #define PIE_E1	( \
> >  	PIRx_ELx_PERM(pte_pi_index(_PAGE_EXECONLY),      PIE_NONE_O) | \
> 

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 06/29] arm64: context switch POR_EL0 register
  2024-07-15  8:27   ` Anshuman Khandual
  2024-07-16 13:21     ` Mark Brown
@ 2024-07-18 14:16     ` Joey Gouly
  1 sibling, 0 replies; 146+ messages in thread
From: Joey Gouly @ 2024-07-18 14:16 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Mon, Jul 15, 2024 at 01:57:10PM +0530, Anshuman Khandual wrote:
> 
> 
> On 5/3/24 18:31, Joey Gouly wrote:
> > POR_EL0 is a register that can be modified by userspace directly,
> > so it must be context switched.
> > 
> > Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > ---
> >  arch/arm64/include/asm/cpufeature.h |  6 ++++++
> >  arch/arm64/include/asm/processor.h  |  1 +
> >  arch/arm64/include/asm/sysreg.h     |  3 +++
> >  arch/arm64/kernel/process.c         | 28 ++++++++++++++++++++++++++++
> >  4 files changed, 38 insertions(+)
> > 
> > diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> > index 8b904a757bd3..d46aab23e06e 100644
> > --- a/arch/arm64/include/asm/cpufeature.h
> > +++ b/arch/arm64/include/asm/cpufeature.h
> > @@ -832,6 +832,12 @@ static inline bool system_supports_lpa2(void)
> >  	return cpus_have_final_cap(ARM64_HAS_LPA2);
> >  }
> >  
> > +static inline bool system_supports_poe(void)
> > +{
> > +	return IS_ENABLED(CONFIG_ARM64_POE) &&
> 
> CONFIG_ARM64_POE has not been defined/added until now ?
> 
> > +		alternative_has_cap_unlikely(ARM64_HAS_S1POE);
> > +}
> > +
> >  int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
> >  bool try_emulate_mrs(struct pt_regs *regs, u32 isn);
> >  
> > diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> > index f77371232d8c..e6376f979273 100644
> > --- a/arch/arm64/include/asm/processor.h
> > +++ b/arch/arm64/include/asm/processor.h
> > @@ -184,6 +184,7 @@ struct thread_struct {
> >  	u64			sctlr_user;
> >  	u64			svcr;
> >  	u64			tpidr2_el0;
> > +	u64			por_el0;
> >  };
> 
> As there going to be a new config i.e CONFIG_ARM64_POE, should not this
> register be wrapped up with #ifdef CONFIG_ARM64_POE as well ? Similarly
> access into p->thread.por_el0 should also be conditional on that config.

It seems like we're a bit inconsistent here; for example, tpidr2_el0 from
FEAT_SME is not guarded. Not guarding means we avoid littering the C files
with #ifdefs, and since system_supports_poe() checks whether CONFIG_ARM64_POE
is enabled, most of the code should be optimised away anyway. So unless
there's a good reason, I think it makes sense to stay this way.

> 
> >  
> >  static inline unsigned int thread_get_vl(struct thread_struct *thread,
> > diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> > index 9e8999592f3a..62c399811dbf 100644
> > --- a/arch/arm64/include/asm/sysreg.h
> > +++ b/arch/arm64/include/asm/sysreg.h
> > @@ -1064,6 +1064,9 @@
> >  #define POE_RXW		UL(0x7)
> >  #define POE_MASK	UL(0xf)
> >  
> > +/* Initial value for Permission Overlay Extension for EL0 */
> > +#define POR_EL0_INIT	POE_RXW
> 
> The idea behind POE_RXW as the init value is to be all-permissive?

Yup, the default index, 0, needs to allow everything.

> 
> > +
> >  #define ARM64_FEATURE_FIELD_BITS	4
> >  
> >  /* Defined for compatibility only, do not add new users. */
> > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> > index 4ae31b7af6c3..0ffaca98bed6 100644
> > --- a/arch/arm64/kernel/process.c
> > +++ b/arch/arm64/kernel/process.c
> > @@ -271,12 +271,23 @@ static void flush_tagged_addr_state(void)
> >  		clear_thread_flag(TIF_TAGGED_ADDR);
> >  }
> >  
> > +static void flush_poe(void)
> > +{
> > +	if (!system_supports_poe())
> > +		return;
> > +
> > +	write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0);
> > +	/* ISB required for kernel uaccess routines when changing POR_EL0 */
> > +	isb();
> > +}
> > +
> >  void flush_thread(void)
> >  {
> >  	fpsimd_flush_thread();
> >  	tls_thread_flush();
> >  	flush_ptrace_hw_breakpoint(current);
> >  	flush_tagged_addr_state();
> > +	flush_poe();
> >  }
> >  
> >  void arch_release_task_struct(struct task_struct *tsk)
> > @@ -371,6 +382,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
> >  		if (system_supports_tpidr2())
> >  			p->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
> >  
> > +		if (system_supports_poe())
> > +			p->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
> > +
> >  		if (stack_start) {
> >  			if (is_compat_thread(task_thread_info(p)))
> >  				childregs->compat_sp = stack_start;
> > @@ -495,6 +509,19 @@ static void erratum_1418040_new_exec(void)
> >  	preempt_enable();
> >  }
> >  
> > +static void permission_overlay_switch(struct task_struct *next)
> > +{
> > +	if (!system_supports_poe())
> > +		return;
> > +
> > +	current->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
> > +	if (current->thread.por_el0 != next->thread.por_el0) {
> > +		write_sysreg_s(next->thread.por_el0, SYS_POR_EL0);
> > +		/* ISB required for kernel uaccess routines when changing POR_EL0 */
> > +		isb();
> > +	}
> > +}
> > +
> >  /*
> >   * __switch_to() checks current->thread.sctlr_user as an optimisation. Therefore
> >   * this function must be called with preemption disabled and the update to
> > @@ -530,6 +557,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
> >  	ssbs_thread_switch(next);
> >  	erratum_1418040_thread_switch(next);
> >  	ptrauth_thread_switch_user(next);
> > +	permission_overlay_switch(next);
> >  
> >  	/*
> >  	 * Complete any pending TLB or cache maintenance on this CPU in case
> 

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 17/29] arm64: implement PKEYS support
  2024-07-11  9:50               ` Joey Gouly
@ 2024-07-18 14:45                 ` Szabolcs Nagy
  0 siblings, 0 replies; 146+ messages in thread
From: Szabolcs Nagy @ 2024-07-18 14:45 UTC (permalink / raw)
  To: Joey Gouly, Catalin Marinas
  Cc: Florian Weimer, dave.hansen, linux-arm-kernel, akpm, aneesh.kumar,
	aneesh.kumar, bp, broonie, christophe.leroy, hpa, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, tglx, will, x86, kvmarm, yury.khrustalev

The 07/11/2024 10:50, Joey Gouly wrote:
> On Mon, Jul 08, 2024 at 06:53:18PM +0100, Catalin Marinas wrote:
> > On Mon, Jun 17, 2024 at 03:51:35PM +0100, Szabolcs Nagy wrote:
> > > to me it makes sense to have abstract
> > > 
> > > PKEY_DISABLE_READ
> > > PKEY_DISABLE_WRITE
> > > PKEY_DISABLE_EXECUTE
> > > PKEY_DISABLE_ACCESS
> > > 
> > > where access is handled like
> > > 
> > > if (flags&PKEY_DISABLE_ACCESS)
> > > 	flags |= PKEY_DISABLE_READ|PKEY_DISABLE_WRITE;
> > > disable_read = flags&PKEY_DISABLE_READ;
> > > disable_write = flags&PKEY_DISABLE_WRITE;
> > > disable_exec = flags&PKEY_DISABLE_EXECUTE;
...
> > On powerpc, PKEY_DISABLE_ACCESS also disables execution. AFAICT, the
...
> Seems to me that PKEY_DISABLE_ACCESS leaves exec permissions as-is.

assuming this is right, the patch below looks
reasonable to me. thanks.

> Here is the patch I am planning to include in the next version of the series.
> This should support all PKEY_DISABLE_* combinations. Any comments? 
> 
> commit ba51371a544f6b0a4a0f03df62ad894d53f5039b
> Author: Joey Gouly <joey.gouly@arm.com>
> Date:   Thu Jul 4 11:29:20 2024 +0100
> 
>     arm64: add PKEY_DISABLE_READ and PKEY_DISABLE_EXEC

it's PKEY_DISABLE_EXECUTE (fwiw i like the shorter
exec better but ppc seems to use execute)

>     
>     TODO
>     
>     Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> 
> diff --git arch/arm64/include/uapi/asm/mman.h arch/arm64/include/uapi/asm/mman.h
> index 1e6482a838e1..e7e0c8216243 100644
> --- arch/arm64/include/uapi/asm/mman.h
> +++ arch/arm64/include/uapi/asm/mman.h
> @@ -7,4 +7,13 @@
>  #define PROT_BTI       0x10            /* BTI guarded page */
>  #define PROT_MTE       0x20            /* Normal Tagged mapping */
>  
> +/* Override any generic PKEY permission defines */
> +#define PKEY_DISABLE_EXECUTE   0x4
> +#define PKEY_DISABLE_READ      0x8
> +#undef PKEY_ACCESS_MASK
> +#define PKEY_ACCESS_MASK       (PKEY_DISABLE_ACCESS |\
> +                               PKEY_DISABLE_WRITE  |\
> +                               PKEY_DISABLE_READ   |\
> +                               PKEY_DISABLE_EXECUTE)
> +
>  #endif /* ! _UAPI__ASM_MMAN_H */
> diff --git arch/arm64/mm/mmu.c arch/arm64/mm/mmu.c
> index 68afe5fc3071..ce4cc6bdee4e 100644
> --- arch/arm64/mm/mmu.c
> +++ arch/arm64/mm/mmu.c
> @@ -1570,10 +1570,15 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey, unsigned long i
>                 return -EINVAL;
>  
>         /* Set the bits we need in POR:  */
> +       new_por = POE_RXW;
> +       if (init_val & PKEY_DISABLE_WRITE)
> +               new_por &= ~POE_W;
>         if (init_val & PKEY_DISABLE_ACCESS)
> -               new_por = POE_X;
> -       else if (init_val & PKEY_DISABLE_WRITE)
> -               new_por = POE_RX;
> +               new_por &= ~POE_RW;
> +       if (init_val & PKEY_DISABLE_READ)
> +               new_por &= ~POE_R;
> +       if (init_val & PKEY_DISABLE_EXECUTE)
> +               new_por &= ~POE_X;
>  
>         /* Shift the bits in to the correct place in POR for pkey: */
>         pkey_shift = pkey * POR_BITS_PER_PKEY;
> 
> 
> 
> Thanks,
> Joey

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-05-03 13:01 ` [PATCH v4 18/29] arm64: add POE signal support Joey Gouly
                     ` (2 preceding siblings ...)
  2024-07-09 13:08   ` Kevin Brodsky
@ 2024-07-22  9:16   ` Anshuman Khandual
  2024-07-25 16:00   ` Dave Martin
  4 siblings, 0 replies; 146+ messages in thread
From: Anshuman Khandual @ 2024-07-22  9:16 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm



On 5/3/24 18:31, Joey Gouly wrote:
> Add PKEY support to signals, by saving and restoring POR_EL0 from the stackframe.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Reviewed-by: Mark Brown <broonie@kernel.org>
> Acked-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

> ---
>  arch/arm64/include/uapi/asm/sigcontext.h |  7 ++++
>  arch/arm64/kernel/signal.c               | 52 ++++++++++++++++++++++++
>  2 files changed, 59 insertions(+)
> 
> diff --git a/arch/arm64/include/uapi/asm/sigcontext.h b/arch/arm64/include/uapi/asm/sigcontext.h
> index 8a45b7a411e0..e4cba8a6c9a2 100644
> --- a/arch/arm64/include/uapi/asm/sigcontext.h
> +++ b/arch/arm64/include/uapi/asm/sigcontext.h
> @@ -98,6 +98,13 @@ struct esr_context {
>  	__u64 esr;
>  };
>  
> +#define POE_MAGIC	0x504f4530
> +
> +struct poe_context {
> +	struct _aarch64_ctx head;
> +	__u64 por_el0;
> +};
> +
>  /*
>   * extra_context: describes extra space in the signal frame for
>   * additional structures that don't fit in sigcontext.__reserved[].
> diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> index 4a77f4976e11..077436a8bc10 100644
> --- a/arch/arm64/kernel/signal.c
> +++ b/arch/arm64/kernel/signal.c
> @@ -63,6 +63,7 @@ struct rt_sigframe_user_layout {
>  	unsigned long fpmr_offset;
>  	unsigned long extra_offset;
>  	unsigned long end_offset;
> +	unsigned long poe_offset;
>  };
>  
>  #define BASE_SIGFRAME_SIZE round_up(sizeof(struct rt_sigframe), 16)
> @@ -185,6 +186,8 @@ struct user_ctxs {
>  	u32 zt_size;
>  	struct fpmr_context __user *fpmr;
>  	u32 fpmr_size;
> +	struct poe_context __user *poe;
> +	u32 poe_size;
>  };
>  
>  static int preserve_fpsimd_context(struct fpsimd_context __user *ctx)
> @@ -258,6 +261,21 @@ static int restore_fpmr_context(struct user_ctxs *user)
>  	return err;
>  }
>  
> +static int restore_poe_context(struct user_ctxs *user)
> +{
> +	u64 por_el0;
> +	int err = 0;
> +
> +	if (user->poe_size != sizeof(*user->poe))
> +		return -EINVAL;
> +
> +	__get_user_error(por_el0, &(user->poe->por_el0), err);
> +	if (!err)
> +		write_sysreg_s(por_el0, SYS_POR_EL0);
> +
> +	return err;
> +}
> +
>  #ifdef CONFIG_ARM64_SVE
>  
>  static int preserve_sve_context(struct sve_context __user *ctx)
> @@ -621,6 +639,7 @@ static int parse_user_sigframe(struct user_ctxs *user,
>  	user->za = NULL;
>  	user->zt = NULL;
>  	user->fpmr = NULL;
> +	user->poe = NULL;
>  
>  	if (!IS_ALIGNED((unsigned long)base, 16))
>  		goto invalid;
> @@ -671,6 +690,17 @@ static int parse_user_sigframe(struct user_ctxs *user,
>  			/* ignore */
>  			break;
>  
> +		case POE_MAGIC:
> +			if (!system_supports_poe())
> +				goto invalid;
> +
> +			if (user->poe)
> +				goto invalid;
> +
> +			user->poe = (struct poe_context __user *)head;
> +			user->poe_size = size;
> +			break;
> +
>  		case SVE_MAGIC:
>  			if (!system_supports_sve() && !system_supports_sme())
>  				goto invalid;
> @@ -857,6 +887,9 @@ static int restore_sigframe(struct pt_regs *regs,
>  	if (err == 0 && system_supports_sme2() && user.zt)
>  		err = restore_zt_context(&user);
>  
> +	if (err == 0 && system_supports_poe() && user.poe)
> +		err = restore_poe_context(&user);
> +
>  	return err;
>  }
>  
> @@ -980,6 +1013,13 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
>  			return err;
>  	}
>  
> +	if (system_supports_poe()) {
> +		err = sigframe_alloc(user, &user->poe_offset,
> +				     sizeof(struct poe_context));
> +		if (err)
> +			return err;
> +	}
> +
>  	return sigframe_alloc_end(user);
>  }
>  
> @@ -1020,6 +1060,15 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user,
>  		__put_user_error(current->thread.fault_code, &esr_ctx->esr, err);
>  	}
>  
> +	if (system_supports_poe() && err == 0 && user->poe_offset) {
> +		struct poe_context __user *poe_ctx =
> +			apply_user_offset(user, user->poe_offset);
> +
> +		__put_user_error(POE_MAGIC, &poe_ctx->head.magic, err);
> +		__put_user_error(sizeof(*poe_ctx), &poe_ctx->head.size, err);
> +		__put_user_error(read_sysreg_s(SYS_POR_EL0), &poe_ctx->por_el0, err);
> +	}
> +
>  	/* Scalable Vector Extension state (including streaming), if present */
>  	if ((system_supports_sve() || system_supports_sme()) &&
>  	    err == 0 && user->sve_offset) {
> @@ -1178,6 +1227,9 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
>  		sme_smstop();
>  	}
>  
> +	if (system_supports_poe())
> +		write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0);
> +
>  	if (ka->sa.sa_flags & SA_RESTORER)
>  		sigtramp = ka->sa.sa_restorer;
>  	else

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 17/29] arm64: implement PKEYS support
  2024-07-05 16:59   ` Catalin Marinas
@ 2024-07-22 13:39     ` Kevin Brodsky
  0 siblings, 0 replies; 146+ messages in thread
From: Kevin Brodsky @ 2024-07-22 13:39 UTC (permalink / raw)
  To: Catalin Marinas, Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On 05/07/2024 18:59, Catalin Marinas wrote:
> On Fri, May 03, 2024 at 02:01:35PM +0100, Joey Gouly wrote:
>> @@ -163,7 +182,8 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
>>  #define pte_access_permitted_no_overlay(pte, write) \
>>  	(((pte_val(pte) & (PTE_VALID | PTE_USER)) == (PTE_VALID | PTE_USER)) && (!(write) || pte_write(pte)))
>>  #define pte_access_permitted(pte, write) \
>> -	pte_access_permitted_no_overlay(pte, write)
>> +	(pte_access_permitted_no_overlay(pte, write) && \
>> +	por_el0_allows_pkey(FIELD_GET(PTE_PO_IDX_MASK, pte_val(pte)), write, false))
> I'm still not entirely convinced on checking the keys during fast GUP
> but that's what x86 and powerpc do already, so I guess we'll follow the
> same ABI.

I've thought about this some more. In summary I don't think adding this
check to pte_access_permitted() is controversial, but we should decide
how POR_EL0 is set for kernel threads.

This change essentially means that fast GUP behaves like uaccess for
pages that are already present: in both cases POR_EL0 will be looked up
based on the POIndex of the page being accessed (by the hardware in the
uaccess case, and explicitly in the fast GUP case). Fast GUP always
operates on current->mm, so to me checking POR_EL0 in
pte_access_permitted() should be no more restrictive than a uaccess
check from a user perspective. In other words, POR_EL0 is checked when
the kernel accesses user memory on the user's behalf, whether through
uaccess or GUP.

It's also worth noting that the "slow" GUP path (which
get_user_pages_fast() falls back to if a page is missing) also checks
POR_EL0 by virtue of calling handle_mm_fault(), which in turn calls
arch_vma_access_permitted(). It would be pretty inconsistent for the
slow GUP path to do a pkey check but not the fast path. (That said, the
slow GUP path does not call arch_vma_access_permitted() if a page is
already present, so callers of get_user_pages() and similar will get
inconsistent checking. Not great, that may be worth fixing - but that's
clearly beyond the scope of this series.)

Now an interesting question is what happens with kernel threads that
access user memory, as is the case for the optional io_uring kernel
thread (IORING_SETUP_SQPOLL). The discussion above holds regardless of
the type of thread, so the sqpoll thread will have its POR_EL0 checked
when processing commands that involve uaccess or GUP. AFAICT, this
series does not have special handling for kernel threads w.r.t. POR_EL0,
which means that it is left unchanged when a new kernel thread is cloned
(create_io_thread() in the IORING_SETUP_SQPOLL case). The sqpoll thread
will therefore inherit POR_EL0 from the (user) thread that calls
io_uring_setup(). In other words, the sqpoll thread ends up with the
same view of user memory as that user thread - for instance if its
POR_EL0 prevents access to POIndex 1, then any I/O that the sqpoll
thread attempts on mappings with POIndex/pkey 1 will fail.

This behaviour seems potentially useful to me, as the io_uring SQ could
easily become a way to bypass POE without some restriction. However, it
feels like this should be documented, as one should keep it in mind when
using pkeys, and there may well be other cases where kernel threads are
impacted by POR_EL0. I am also unsure how x86/ppc handle this.

Kevin

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 06/29] arm64: context switch POR_EL0 register
  2024-05-03 13:01 ` [PATCH v4 06/29] arm64: context switch POR_EL0 register Joey Gouly
                     ` (2 preceding siblings ...)
  2024-07-15  8:27   ` Anshuman Khandual
@ 2024-07-22 13:40   ` Kevin Brodsky
  2024-07-25 15:46   ` Dave Martin
  4 siblings, 0 replies; 146+ messages in thread
From: Kevin Brodsky @ 2024-07-22 13:40 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On 03/05/2024 15:01, Joey Gouly wrote:
> @@ -371,6 +382,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
>  		if (system_supports_tpidr2())
>  			p->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
>  
> +		if (system_supports_poe())
> +			p->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);

This is most likely needed for kernel threads as well, because they may
be affected by POR_EL0 too (see my reply on patch 17).

Kevin

> +
>  		if (stack_start) {
>  			if (is_compat_thread(task_thread_info(p)))
>  				childregs->compat_sp = stack_start;

* Re: [PATCH v4 17/29] arm64: implement PKEYS support
  2024-05-03 13:01 ` [PATCH v4 17/29] arm64: implement PKEYS support Joey Gouly
                     ` (3 preceding siblings ...)
  2024-07-09 13:07   ` Kevin Brodsky
@ 2024-07-23  4:22   ` Anshuman Khandual
  2024-07-25 16:12   ` Dave Martin
  5 siblings, 0 replies; 146+ messages in thread
From: Anshuman Khandual @ 2024-07-23  4:22 UTC (permalink / raw)
  To: Joey Gouly, linux-arm-kernel
  Cc: akpm, aneesh.kumar, aneesh.kumar, bp, broonie, catalin.marinas,
	christophe.leroy, dave.hansen, hpa, linux-fsdevel, linux-mm,
	linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On 5/3/24 18:31, Joey Gouly wrote:
> Implement the PKEYS interface, using the Permission Overlay Extension.

This commit message should contain more detail, considering the
amount of code change proposed in this patch.

> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/mmu.h         |   1 +
>  arch/arm64/include/asm/mmu_context.h |  51 ++++++++++++-
>  arch/arm64/include/asm/pgtable.h     |  22 +++++-
>  arch/arm64/include/asm/pkeys.h       | 110 +++++++++++++++++++++++++++
>  arch/arm64/include/asm/por.h         |  33 ++++++++
>  arch/arm64/mm/mmu.c                  |  40 ++++++++++
>  6 files changed, 255 insertions(+), 2 deletions(-)
>  create mode 100644 arch/arm64/include/asm/pkeys.h
>  create mode 100644 arch/arm64/include/asm/por.h
> 
> diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
> index 65977c7783c5..983afeb4eba5 100644
> --- a/arch/arm64/include/asm/mmu.h
> +++ b/arch/arm64/include/asm/mmu.h
> @@ -25,6 +25,7 @@ typedef struct {
>  	refcount_t	pinned;
>  	void		*vdso;
>  	unsigned long	flags;
> +	u8		pkey_allocation_map;

arch_max_pkey() is 7 on arm64, with bit 0 reserved for the first pkey,
so is it possible for the entire pkey_allocation_map to be completely
used up in practice? Or is the maximum number of pkeys that can be
allocated actually ARCH_PKEY_BITS?

>  } mm_context_t;
>  
>  /*
> diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
> index c768d16b81a4..cb499db7a97b 100644
> --- a/arch/arm64/include/asm/mmu_context.h
> +++ b/arch/arm64/include/asm/mmu_context.h
> @@ -15,12 +15,12 @@
>  #include <linux/sched/hotplug.h>
>  #include <linux/mm_types.h>
>  #include <linux/pgtable.h>
> +#include <linux/pkeys.h>
>  
>  #include <asm/cacheflush.h>
>  #include <asm/cpufeature.h>
>  #include <asm/daifflags.h>
>  #include <asm/proc-fns.h>
> -#include <asm-generic/mm_hooks.h>
>  #include <asm/cputype.h>
>  #include <asm/sysreg.h>
>  #include <asm/tlbflush.h>
> @@ -175,9 +175,36 @@ init_new_context(struct task_struct *tsk, struct mm_struct *mm)
>  {
>  	atomic64_set(&mm->context.id, 0);
>  	refcount_set(&mm->context.pinned, 0);
> +
> +	/* pkey 0 is the default, so always reserve it. */
> +	mm->context.pkey_allocation_map = 0x1;

Very small nit: given the 1U << pkey allocation mechanism, the
following might make more sense, with the first bit being the default
one.

	mm->context.pkey_allocation_map = (1U << 0);

OR probably even making it a const or something.

> +
> +	return 0;
> +}
> +
> +static inline void arch_dup_pkeys(struct mm_struct *oldmm,
> +				  struct mm_struct *mm)
> +{
> +	/* Duplicate the oldmm pkey state in mm: */
> +	mm->context.pkey_allocation_map = oldmm->context.pkey_allocation_map;
> +}
> +
> +static inline int arch_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm)
> +{
> +	arch_dup_pkeys(oldmm, mm);
> +
>  	return 0;
>  }
>  
> +static inline void arch_exit_mmap(struct mm_struct *mm)
> +{
> +}
> +
> +static inline void arch_unmap(struct mm_struct *mm,
> +			unsigned long start, unsigned long end)
> +{
> +}
> +
>  #ifdef CONFIG_ARM64_SW_TTBR0_PAN
>  static inline void update_saved_ttbr0(struct task_struct *tsk,
>  				      struct mm_struct *mm)
> @@ -267,6 +294,28 @@ static inline unsigned long mm_untag_mask(struct mm_struct *mm)
>  	return -1UL >> 8;
>  }
>  
> +/*
> + * We only want to enforce protection keys on the current process
> + * because we effectively have no access to POR_EL0 for other
> + * processes or any way to tell *which * POR_EL0 in a threaded
> + * process we could use.
> + *
> + * So do not enforce things if the VMA is not from the current
> + * mm, or if we are in a kernel thread.
> + */

As mentioned in the other thread, this comment can be improved.

> +static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
> +		bool write, bool execute, bool foreign)
> +{
> +	if (!arch_pkeys_enabled())
> +		return true;
> +
> +	/* allow access if the VMA is not one from this process */
> +	if (foreign || vma_is_foreign(vma))
> +		return true;
> +
> +	return por_el0_allows_pkey(vma_pkey(vma), write, execute);
> +}
> +
>  #include <asm-generic/mmu_context.h>
>  
>  #endif /* !__ASSEMBLY__ */
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 2449e4e27ea6..8ee68ff03016 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -34,6 +34,7 @@
>  
>  #include <asm/cmpxchg.h>
>  #include <asm/fixmap.h>
> +#include <asm/por.h>
>  #include <linux/mmdebug.h>
>  #include <linux/mm_types.h>
>  #include <linux/sched.h>
> @@ -153,6 +154,24 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
>  #define pte_accessible(mm, pte)	\
>  	(mm_tlb_flush_pending(mm) ? pte_present(pte) : pte_valid(pte))
>  
> +static inline bool por_el0_allows_pkey(u8 pkey, bool write, bool execute)
> +{
> +	u64 por;
> +
> +	if (!system_supports_poe())
> +		return true;

This is redundant: the same check is already done in
arch_vma_access_permitted(), which is the sole caller of this function.

> +
> +	por = read_sysreg_s(SYS_POR_EL0);
> +
> +	if (write)
> +		return por_elx_allows_write(por, pkey);
> +
> +	if (execute)
> +		return por_elx_allows_exec(por, pkey);
> +
> +	return por_elx_allows_read(por, pkey);
> +}
> +
>  /*
>   * p??_access_permitted() is true for valid user mappings (PTE_USER
>   * bit set, subject to the write permission check). For execute-only
> @@ -163,7 +182,8 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
>  #define pte_access_permitted_no_overlay(pte, write) \
>  	(((pte_val(pte) & (PTE_VALID | PTE_USER)) == (PTE_VALID | PTE_USER)) && (!(write) || pte_write(pte)))
>  #define pte_access_permitted(pte, write) \
> -	pte_access_permitted_no_overlay(pte, write)
> +	(pte_access_permitted_no_overlay(pte, write) && \
> +	por_el0_allows_pkey(FIELD_GET(PTE_PO_IDX_MASK, pte_val(pte)), write, false))
>  #define pmd_access_permitted(pmd, write) \
>  	(pte_access_permitted(pmd_pte(pmd), (write)))
>  #define pud_access_permitted(pud, write) \
> diff --git a/arch/arm64/include/asm/pkeys.h b/arch/arm64/include/asm/pkeys.h
> new file mode 100644
> index 000000000000..a284508a4d02
> --- /dev/null
> +++ b/arch/arm64/include/asm/pkeys.h
> @@ -0,0 +1,110 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2023 Arm Ltd.
> + *
> + * Based on arch/x86/include/asm/pkeys.h
> + */
> +
> +#ifndef _ASM_ARM64_PKEYS_H
> +#define _ASM_ARM64_PKEYS_H
> +
> +#define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2)
> +
> +#define arch_max_pkey() 7

May be this should be made 8 including the default pkey bit 0.

> +
> +int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
> +		unsigned long init_val);
> +
> +static inline bool arch_pkeys_enabled(void)
> +{
> +	return false;
> +}
> +
> +static inline int vma_pkey(struct vm_area_struct *vma)
> +{
> +	return (vma->vm_flags & ARCH_VM_PKEY_FLAGS) >> VM_PKEY_SHIFT;
> +}
> +
> +static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
> +		int prot, int pkey)
> +{

The following comment from x86's __arch_override_mprotect_pkey() seems
applicable here as well. Please consider adding it.

        /*
         * Is this an mprotect_pkey() call?  If so, never
         * override the value that came from the user.
         */

> +	if (pkey != -1)
> +		return pkey;
> +
> +	return vma_pkey(vma);
> +}
> +
> +static inline int execute_only_pkey(struct mm_struct *mm)
> +{
> +	// Execute-only mappings are handled by EPAN/FEAT_PAN3.
> +	WARN_ON_ONCE(!cpus_have_final_cap(ARM64_HAS_EPAN));
> +
> +	return -1;
> +}
> +
> +#define mm_pkey_allocation_map(mm)	(mm->context.pkey_allocation_map)
> +#define mm_set_pkey_allocated(mm, pkey) do {		\
> +	mm_pkey_allocation_map(mm) |= (1U << pkey);	\
> +} while (0)
> +#define mm_set_pkey_free(mm, pkey) do {			\
> +	mm_pkey_allocation_map(mm) &= ~(1U << pkey);	\
> +} while (0)
> +
> +static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
> +{
> +	/*
> +	 * "Allocated" pkeys are those that have been returned
> +	 * from pkey_alloc() or pkey 0 which is allocated
> +	 * implicitly when the mm is created.
> +	 */
> +	if (pkey < 0)
> +		return false;
> +	if (pkey >= arch_max_pkey())
> +		return false;

These range checks can be folded into the same conditional statement.

> +
> +	return mm_pkey_allocation_map(mm) & (1U << pkey);
> +}
> +
> +/*
> + * Returns a positive, 3-bit key on success, or -1 on failure.
> + */
> +static inline int mm_pkey_alloc(struct mm_struct *mm)
> +{
> +	/*
> +	 * Note: this is the one and only place we make sure
> +	 * that the pkey is valid as far as the hardware is
> +	 * concerned.  The rest of the kernel trusts that
> +	 * only good, valid pkeys come out of here.
> +	 */
> +	u8 all_pkeys_mask = ((1U << arch_max_pkey()) - 1);
> +	int ret;
> +
> +	if (!arch_pkeys_enabled())
> +		return -1;

I am wondering whether the pkey's range should be asserted here first,
as it is in mm_pkey_is_allocated()?

> +
> +	/*
> +	 * Are we out of pkeys?  We must handle this specially
> +	 * because ffz() behavior is undefined if there are no
> +	 * zeros.
> +	 */
> +	if (mm_pkey_allocation_map(mm) == all_pkeys_mask)
> +		return -1;
> +
> +	ret = ffz(mm_pkey_allocation_map(mm));
> +
> +	mm_set_pkey_allocated(mm, ret);
> +
> +	return ret;
> +}
> +
> +static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
> +{
> +	if (!mm_pkey_is_allocated(mm, pkey))
> +		return -EINVAL;
> +
> +	mm_set_pkey_free(mm, pkey);
> +
> +	return 0;
> +}
> +
> +#endif /* _ASM_ARM64_PKEYS_H */
> diff --git a/arch/arm64/include/asm/por.h b/arch/arm64/include/asm/por.h
> new file mode 100644
> index 000000000000..d6604e0c5c54
> --- /dev/null
> +++ b/arch/arm64/include/asm/por.h
> @@ -0,0 +1,33 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2023 Arm Ltd.
> + */
> +
> +#ifndef _ASM_ARM64_POR_H
> +#define _ASM_ARM64_POR_H
> +
> +#define POR_BITS_PER_PKEY		4
> +#define POR_ELx_IDX(por_elx, idx)	(((por_elx) >> (idx * POR_BITS_PER_PKEY)) & 0xf)
> +
> +static inline bool por_elx_allows_read(u64 por, u8 pkey)
> +{
> +	u8 perm = POR_ELx_IDX(por, pkey);
> +
> +	return perm & POE_R;
> +}
> +
> +static inline bool por_elx_allows_write(u64 por, u8 pkey)
> +{
> +	u8 perm = POR_ELx_IDX(por, pkey);
> +
> +	return perm & POE_W;
> +}
> +
> +static inline bool por_elx_allows_exec(u64 por, u8 pkey)
> +{
> +	u8 perm = POR_ELx_IDX(por, pkey);
> +
> +	return perm & POE_X;
> +}
> +
> +#endif /* _ASM_ARM64_POR_H */
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 495b732d5af3..e50ccc86d150 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -25,6 +25,7 @@
>  #include <linux/vmalloc.h>
>  #include <linux/set_memory.h>
>  #include <linux/kfence.h>
> +#include <linux/pkeys.h>
>  
>  #include <asm/barrier.h>
>  #include <asm/cputype.h>
> @@ -1535,3 +1536,42 @@ void __cpu_replace_ttbr1(pgd_t *pgdp, bool cnp)
>  
>  	cpu_uninstall_idmap();
>  }
> +
> +#ifdef CONFIG_ARCH_HAS_PKEYS
> +int arch_set_user_pkey_access(struct task_struct *tsk, int pkey, unsigned long init_val)
> +{
> +	u64 new_por = POE_RXW;
> +	u64 old_por;
> +	u64 pkey_shift;
> +
> +	if (!arch_pkeys_enabled())
> +		return -ENOSPC;

This code path might not be possible and hence the check is redundant.
If arch_pkeys_enabled() returns false, then pkey_alloc() will just
bail out and arch_set_user_pkey_access() would not be called afterwards.

SYSCALL..(pkey_alloc)
	mm_pkey_alloc()
		arch_pkeys_enabled()		
	...............
	arch_set_user_pkey_access()
		arch_pkeys_enabled()
> +
> +	/*
> +	 * This code should only be called with valid 'pkey'
> +	 * values originating from in-kernel users.  Complain
> +	 * if a bad value is observed.
> +	 */
> +	if (WARN_ON_ONCE(pkey >= arch_max_pkey()))
> +		return -EINVAL;

If the pkey's range check were done in mm_pkey_alloc() itself - which
seems a better place for it - this warning here would not be
necessary.

> +
> +	/* Set the bits we need in POR:  */
> +	if (init_val & PKEY_DISABLE_ACCESS)
> +		new_por = POE_X;
> +	else if (init_val & PKEY_DISABLE_WRITE)
> +		new_por = POE_RX;
> +
> +	/* Shift the bits in to the correct place in POR for pkey: */
> +	pkey_shift = pkey * POR_BITS_PER_PKEY;
> +	new_por <<= pkey_shift;
> +
> +	/* Get old POR and mask off any old bits in place: */
> +	old_por = read_sysreg_s(SYS_POR_EL0);
> +	old_por &= ~(POE_MASK << pkey_shift);
> +
> +	/* Write old part along with new part: */
> +	write_sysreg_s(old_por | new_por, SYS_POR_EL0);
> +
> +	return 0;
> +}
> +#endif

* Re: [PATCH v4 04/29] arm64: disable trapping of POR_EL0 to EL2
  2024-05-03 13:01 ` [PATCH v4 04/29] arm64: disable trapping of POR_EL0 to EL2 Joey Gouly
  2024-07-15  7:47   ` Anshuman Khandual
@ 2024-07-25 15:44   ` Dave Martin
  2024-08-06 10:04     ` Joey Gouly
  1 sibling, 1 reply; 146+ messages in thread
From: Dave Martin @ 2024-07-25 15:44 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

Hi,

On Fri, May 03, 2024 at 02:01:22PM +0100, Joey Gouly wrote:
> Allow EL0 or EL1 to access POR_EL0 without being trapped to EL2.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
>  arch/arm64/include/asm/el2_setup.h | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
> index b7afaa026842..df5614be4b70 100644
> --- a/arch/arm64/include/asm/el2_setup.h
> +++ b/arch/arm64/include/asm/el2_setup.h
> @@ -184,12 +184,20 @@
>  .Lset_pie_fgt_\@:
>  	mrs_s	x1, SYS_ID_AA64MMFR3_EL1
>  	ubfx	x1, x1, #ID_AA64MMFR3_EL1_S1PIE_SHIFT, #4
> -	cbz	x1, .Lset_fgt_\@
> +	cbz	x1, .Lset_poe_fgt_\@
>  
>  	/* Disable trapping of PIR_EL1 / PIRE0_EL1 */
>  	orr	x0, x0, #HFGxTR_EL2_nPIR_EL1
>  	orr	x0, x0, #HFGxTR_EL2_nPIRE0_EL1
>  
> +.Lset_poe_fgt_\@:
> +	mrs_s	x1, SYS_ID_AA64MMFR3_EL1
> +	ubfx	x1, x1, #ID_AA64MMFR3_EL1_S1POE_SHIFT, #4
> +	cbz	x1, .Lset_fgt_\@
> +
> +	/* Disable trapping of POR_EL0 */
> +	orr	x0, x0, #HFGxTR_EL2_nPOR_EL0

Do I understand correctly that this is just to allow the host to access
its own POR_EL0, before (or unless) KVM starts up?

KVM always overrides all the EL2 trap controls while running a guest,
right?  We don't want this bit still set when running in a guest just
because KVM doesn't know about POE yet.

(Hopefully this follows naturally from the way the KVM code works, but
my KVM-fu is a bit rusty.)

Also, what about POR_EL1?  Do we have to reset that to something sane
(and so untrap it here), or it is sufficient if we never turn on POE
support in the host, via TCR2_EL1.POE?

[...]

Cheers
---Dave

* Re: [PATCH v4 06/29] arm64: context switch POR_EL0 register
  2024-05-03 13:01 ` [PATCH v4 06/29] arm64: context switch POR_EL0 register Joey Gouly
                     ` (3 preceding siblings ...)
  2024-07-22 13:40   ` Kevin Brodsky
@ 2024-07-25 15:46   ` Dave Martin
  4 siblings, 0 replies; 146+ messages in thread
From: Dave Martin @ 2024-07-25 15:46 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Fri, May 03, 2024 at 02:01:24PM +0100, Joey Gouly wrote:
> POR_EL0 is a register that can be modified by userspace directly,
> so it must be context switched.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/cpufeature.h |  6 ++++++
>  arch/arm64/include/asm/processor.h  |  1 +
>  arch/arm64/include/asm/sysreg.h     |  3 +++
>  arch/arm64/kernel/process.c         | 28 ++++++++++++++++++++++++++++
>  4 files changed, 38 insertions(+)

[...]

> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 4ae31b7af6c3..0ffaca98bed6 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -271,12 +271,23 @@ static void flush_tagged_addr_state(void)
>  		clear_thread_flag(TIF_TAGGED_ADDR);
>  }
>  
> +static void flush_poe(void)
> +{
> +	if (!system_supports_poe())
> +		return;
> +
> +	write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0);
> +	/* ISB required for kernel uaccess routines when changing POR_EL0 */
> +	isb();

See my comment on permission_overlay_switch(), below.  However, exec is
slower path code, so including the ISB may be better here than leaving
it for the caller to worry about.

> +}
> +
>  void flush_thread(void)
>  {
>  	fpsimd_flush_thread();
>  	tls_thread_flush();
>  	flush_ptrace_hw_breakpoint(current);
>  	flush_tagged_addr_state();
> +	flush_poe();
>  }
>  
>  void arch_release_task_struct(struct task_struct *tsk)
> @@ -371,6 +382,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
>  		if (system_supports_tpidr2())
>  			p->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
>  
> +		if (system_supports_poe())
> +			p->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
> +

Was POR_EL0 ever reset to something sensible at all?  Does it matter?

(I couldn't find this, but may have missed it.)

>  		if (stack_start) {
>  			if (is_compat_thread(task_thread_info(p)))
>  				childregs->compat_sp = stack_start;
> @@ -495,6 +509,19 @@ static void erratum_1418040_new_exec(void)
>  	preempt_enable();
>  }
>  
> +static void permission_overlay_switch(struct task_struct *next)
> +{
> +	if (!system_supports_poe())
> +		return;
> +
> +	current->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
> +	if (current->thread.por_el0 != next->thread.por_el0) {
> +		write_sysreg_s(next->thread.por_el0, SYS_POR_EL0);
> +		/* ISB required for kernel uaccess routines when changing POR_EL0 */
> +		isb();

Do we really need an extra ISB slap in the middle of context switch?

(i.e., should any uaccess ever happen until context switch is completed,
and so can we coalesce this ISB with a later one?)

> +	}
> +}
> +
>  /*
>   * __switch_to() checks current->thread.sctlr_user as an optimisation. Therefore
>   * this function must be called with preemption disabled and the update to
> @@ -530,6 +557,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
>  	ssbs_thread_switch(next);
>  	erratum_1418040_thread_switch(next);
>  	ptrauth_thread_switch_user(next);
> +	permission_overlay_switch(next);
>  
>  	/*
>  	 * Complete any pending TLB or cache maintenance on this CPU in case

[...]

Cheers
---Dave

* Re: [PATCH v4 19/29] arm64: enable PKEY support for CPUs with S1POE
  2024-07-16 10:47   ` Anshuman Khandual
@ 2024-07-25 15:48     ` Dave Martin
  0 siblings, 0 replies; 146+ messages in thread
From: Dave Martin @ 2024-07-25 15:48 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Joey Gouly, linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar,
	bp, broonie, catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Tue, Jul 16, 2024 at 04:17:12PM +0530, Anshuman Khandual wrote:
> 
> 
> On 5/3/24 18:31, Joey Gouly wrote:
> > Now that PKEYs support has been implemented, enable it for CPUs that
> > support S1POE.
> > 
> > Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > Acked-by: Catalin Marinas <catalin.marinas@arm.com>
> 
> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
> 
> > ---
> >  arch/arm64/include/asm/pkeys.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/arch/arm64/include/asm/pkeys.h b/arch/arm64/include/asm/pkeys.h
> > index a284508a4d02..3ea928ec94c0 100644
> > --- a/arch/arm64/include/asm/pkeys.h
> > +++ b/arch/arm64/include/asm/pkeys.h
> > @@ -17,7 +17,7 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
> >  
> >  static inline bool arch_pkeys_enabled(void)
> >  {
> > -	return false;
> > +	return system_supports_poe();
> >  }
> >  
> >  static inline int vma_pkey(struct vm_area_struct *vma)
> 
> Small nit: would it be better to consistently use the
> system_supports_poe() helper rather than arch_pkeys_enabled() inside
> arch/arm64/ platform code, e.g. during POE fault handling, i.e.
> inside fault_from_pkey()?
> 

(FWIW, arch_pkeys_enabled() looks like the hook for the arch to tell
the pkeys generic code whether the arch support is there, so I guess
the proposed change looks sensible to me.

For the arch backend code that is agnostic to whether pkeys is actually
in use, system_supports_poe() seems to be the more appropriate check.)

Cheers
---Dave

* Re: [PATCH v4 10/29] arm64: enable the Permission Overlay Extension for EL0
  2024-05-03 13:01 ` [PATCH v4 10/29] arm64: enable the Permission Overlay Extension for EL0 Joey Gouly
                     ` (2 preceding siblings ...)
  2024-07-15 20:16   ` Mark Brown
@ 2024-07-25 15:49   ` Dave Martin
  2024-08-01 16:04     ` Joey Gouly
  3 siblings, 1 reply; 146+ messages in thread
From: Dave Martin @ 2024-07-25 15:49 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Fri, May 03, 2024 at 02:01:28PM +0100, Joey Gouly wrote:
> Expose a HWCAP and ID_AA64MMFR3_EL1_S1POE to userspace, so they can be used to
> check if the CPU supports the feature.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
> 
> This takes the last bit of HWCAP2, is this fine? What can we do about more features in the future?
> 
> 
>  Documentation/arch/arm64/elf_hwcaps.rst |  2 ++
>  arch/arm64/include/asm/hwcap.h          |  1 +
>  arch/arm64/include/uapi/asm/hwcap.h     |  1 +
>  arch/arm64/kernel/cpufeature.c          | 14 ++++++++++++++
>  arch/arm64/kernel/cpuinfo.c             |  1 +
>  5 files changed, 19 insertions(+)
> 
> diff --git a/Documentation/arch/arm64/elf_hwcaps.rst b/Documentation/arch/arm64/elf_hwcaps.rst
> index 448c1664879b..694f67fa07d1 100644
> --- a/Documentation/arch/arm64/elf_hwcaps.rst
> +++ b/Documentation/arch/arm64/elf_hwcaps.rst
> @@ -365,6 +365,8 @@ HWCAP2_SME_SF8DP2
>  HWCAP2_SME_SF8DP4
>      Functionality implied by ID_AA64SMFR0_EL1.SF8DP4 == 0b1.
>  
> +HWCAP2_POE
> +    Functionality implied by ID_AA64MMFR3_EL1.S1POE == 0b0001.

Nit: unintentionally dropped blank line before the section heading?

>  
>  4. Unused AT_HWCAP bits
>  -----------------------

[...]

Cheers
---Dave

* Re: [PATCH v4 13/29] arm64: convert protection key into vm_flags and pgprot values
  2024-05-03 13:01 ` [PATCH v4 13/29] arm64: convert protection key into vm_flags and pgprot values Joey Gouly
  2024-05-28  6:54   ` Amit Daniel Kachhap
  2024-07-16  9:05   ` Anshuman Khandual
@ 2024-07-25 15:49   ` Dave Martin
  2024-08-01 10:55     ` Joey Gouly
  2 siblings, 1 reply; 146+ messages in thread
From: Dave Martin @ 2024-07-25 15:49 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Fri, May 03, 2024 at 02:01:31PM +0100, Joey Gouly wrote:
> Modify arch_calc_vm_prot_bits() and vm_get_page_prot() such that the pkey
> value is set in the vm_flags and then into the pgprot value.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/mman.h | 8 +++++++-
>  arch/arm64/mm/mmap.c          | 9 +++++++++
>  2 files changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
> index 5966ee4a6154..ecb2d18dc4d7 100644
> --- a/arch/arm64/include/asm/mman.h
> +++ b/arch/arm64/include/asm/mman.h
> @@ -7,7 +7,7 @@
>  #include <uapi/asm/mman.h>
>  
>  static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> -	unsigned long pkey __always_unused)
> +	unsigned long pkey)
>  {
>  	unsigned long ret = 0;
>  
> @@ -17,6 +17,12 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
>  	if (system_supports_mte() && (prot & PROT_MTE))
>  		ret |= VM_MTE;
>  
> +#if defined(CONFIG_ARCH_HAS_PKEYS)
> +	ret |= pkey & 0x1 ? VM_PKEY_BIT0 : 0;
> +	ret |= pkey & 0x2 ? VM_PKEY_BIT1 : 0;
> +	ret |= pkey & 0x4 ? VM_PKEY_BIT2 : 0;

Out of interest, is this as bad as it looks or does the compiler turn
it into a shift and mask?


> +#endif
> +
>  	return ret;
>  }
>  #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
> diff --git a/arch/arm64/mm/mmap.c b/arch/arm64/mm/mmap.c
> index 642bdf908b22..86eda6bc7893 100644
> --- a/arch/arm64/mm/mmap.c
> +++ b/arch/arm64/mm/mmap.c
> @@ -102,6 +102,15 @@ pgprot_t vm_get_page_prot(unsigned long vm_flags)
>  	if (vm_flags & VM_MTE)
>  		prot |= PTE_ATTRINDX(MT_NORMAL_TAGGED);
>  
> +#ifdef CONFIG_ARCH_HAS_PKEYS
> +	if (vm_flags & VM_PKEY_BIT0)
> +		prot |= PTE_PO_IDX_0;
> +	if (vm_flags & VM_PKEY_BIT1)
> +		prot |= PTE_PO_IDX_1;
> +	if (vm_flags & VM_PKEY_BIT2)
> +		prot |= PTE_PO_IDX_2;
> +#endif
> +

Ditto.  At least we only have three bits to cope with either way.

I'm guessing that these functions are not super-hot path.

[...]

Cheers
---Dave

* Re: [PATCH v4 15/29] arm64: handle PKEY/POE faults
  2024-05-03 13:01 ` [PATCH v4 15/29] arm64: handle PKEY/POE faults Joey Gouly
                     ` (2 preceding siblings ...)
  2024-07-16 10:13   ` Anshuman Khandual
@ 2024-07-25 15:57   ` Dave Martin
  2024-08-01 16:01     ` Joey Gouly
  3 siblings, 1 reply; 146+ messages in thread
From: Dave Martin @ 2024-07-25 15:57 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Fri, May 03, 2024 at 02:01:33PM +0100, Joey Gouly wrote:
> If a memory fault occurs that is due to an overlay/pkey fault, report that to
> userspace with a SEGV_PKUERR.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/traps.h |  1 +
>  arch/arm64/kernel/traps.c      | 12 ++++++--
>  arch/arm64/mm/fault.c          | 56 ++++++++++++++++++++++++++++++++--
>  3 files changed, 64 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/traps.h b/arch/arm64/include/asm/traps.h
> index eefe766d6161..f6f6f2cb7f10 100644
> --- a/arch/arm64/include/asm/traps.h
> +++ b/arch/arm64/include/asm/traps.h
> @@ -25,6 +25,7 @@ try_emulate_armv8_deprecated(struct pt_regs *regs, u32 insn)
>  void force_signal_inject(int signal, int code, unsigned long address, unsigned long err);
>  void arm64_notify_segfault(unsigned long addr);
>  void arm64_force_sig_fault(int signo, int code, unsigned long far, const char *str);
> +void arm64_force_sig_fault_pkey(int signo, int code, unsigned long far, const char *str, int pkey);
>  void arm64_force_sig_mceerr(int code, unsigned long far, short lsb, const char *str);
>  void arm64_force_sig_ptrace_errno_trap(int errno, unsigned long far, const char *str);
>  
> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> index 215e6d7f2df8..1bac6c84d3f5 100644
> --- a/arch/arm64/kernel/traps.c
> +++ b/arch/arm64/kernel/traps.c
> @@ -263,16 +263,24 @@ static void arm64_show_signal(int signo, const char *str)
>  	__show_regs(regs);
>  }
>  
> -void arm64_force_sig_fault(int signo, int code, unsigned long far,
> -			   const char *str)
> +void arm64_force_sig_fault_pkey(int signo, int code, unsigned long far,
> +			   const char *str, int pkey)
>  {
>  	arm64_show_signal(signo, str);
>  	if (signo == SIGKILL)
>  		force_sig(SIGKILL);
> +	else if (code == SEGV_PKUERR)
> +		force_sig_pkuerr((void __user *)far, pkey);

Is signo definitely SIGSEGV here?  It looks to me like we can get in
here for SIGBUS, SIGTRAP etc.

si_codes are not unique between different signo here, so I'm wondering
whether this should this be:

	else if (signo == SIGSEGV && code == SEGV_PKUERR)

...?


>  	else
>  		force_sig_fault(signo, code, (void __user *)far);
>  }
>  
> +void arm64_force_sig_fault(int signo, int code, unsigned long far,
> +			   const char *str)
> +{
> +	arm64_force_sig_fault_pkey(signo, code, far, str, 0);

Is there a reason not to follow the same convention as elsewhere, where
-1 is passed for "no pkey"?

If we think this should never be called with signo == SIGSEGV &&
code == SEGV_PKUERR and no valid pkey but if it's messy to prove, then
maybe a WARN_ON_ONCE() would be worth it here?

[...]

Cheers
---Dave

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-06-03  9:21       ` Amit Daniel Kachhap
@ 2024-07-25 15:58         ` Dave Martin
  2024-07-25 18:11           ` Mark Brown
  0 siblings, 1 reply; 146+ messages in thread
From: Dave Martin @ 2024-07-25 15:58 UTC (permalink / raw)
  To: Amit Daniel Kachhap
  Cc: Mark Brown, Joey Gouly, linux-arm-kernel, akpm, aneesh.kumar,
	aneesh.kumar, bp, catalin.marinas, christophe.leroy, dave.hansen,
	hpa, linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Mon, Jun 03, 2024 at 02:51:46PM +0530, Amit Daniel Kachhap wrote:
> 
> 
> On 5/31/24 22:09, Mark Brown wrote:
> > On Tue, May 28, 2024 at 12:26:54PM +0530, Amit Daniel Kachhap wrote:
> > > On 5/3/24 18:31, Joey Gouly wrote:
> > 
> > > > +#define POE_MAGIC	0x504f4530
> > > > +struct poe_context {
> > > > +	struct _aarch64_ctx head;
> > > > +	__u64 por_el0;
> > > > +};
> > 
> > > There is a comment section in the beginning which mentions the size
> > > of the context frame structure and subsequent reduction in the
> > > reserved range. So this new context description can be added there.
> > > Although looks like it is broken for za, zt and fpmr context.
> > 
> > Could you be more specific about how you think these existing contexts
> > are broken?  The above looks perfectly good and standard and the
> > existing contexts do a reasonable simulation of working.  Note that the
> > ZA and ZT contexts don't generate data payload unless userspace has set
> > PSTATE.ZA.
> 
> Sorry for not being clear on this as I was only referring to the
> comments in file arch/arm64/include/uapi/asm/sigcontext.h and no code
> as such is broken.
> 
>  * Allocation of __reserved[]:
>  * (Note: records do not necessarily occur in the order shown here.)
>  *
>  *      size            description
>  *
>  *      0x210           fpsimd_context
>  *       0x10           esr_context
>  *      0x8a0           sve_context (vl <= 64) (optional)
>  *       0x20           extra_context (optional)
>  *       0x10           terminator (null _aarch64_ctx)
>  *
>  *      0x510           (reserved for future allocation)
> 
> Here I think that optional context like za, zt, fpmr and poe should have
> size mentioned here to make the description consistent. As you said ZA
> and ZT context are enabled by userspace so some extra details can be
> added for them too.

Regarding this, __reserved[] is looking very full now.

I'll post a draft patch separately, since I think the update could
benefit from separate discussion, but my back-of-the-envelope
calculation suggests that (before this patch) we are down to 0x90
bytes of free space (i.e., over 96% full).


I wonder whether it is time to start pushing back on adding a new
_foo_context for every individual register, though?

Maybe we could add some kind of _misc_context for miscellaneous 64-bit
regs.

[...]

Cheers
---Dave

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-05-03 13:01 ` [PATCH v4 18/29] arm64: add POE signal support Joey Gouly
                     ` (3 preceding siblings ...)
  2024-07-22  9:16   ` Anshuman Khandual
@ 2024-07-25 16:00   ` Dave Martin
  2024-08-01 15:54     ` Joey Gouly
  4 siblings, 1 reply; 146+ messages in thread
From: Dave Martin @ 2024-07-25 16:00 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

Hi,

On Fri, May 03, 2024 at 02:01:36PM +0100, Joey Gouly wrote:
> Add PKEY support to signals, by saving and restoring POR_EL0 from the stackframe.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Reviewed-by: Mark Brown <broonie@kernel.org>
> Acked-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
> ---
>  arch/arm64/include/uapi/asm/sigcontext.h |  7 ++++
>  arch/arm64/kernel/signal.c               | 52 ++++++++++++++++++++++++
>  2 files changed, 59 insertions(+)
> 
> diff --git a/arch/arm64/include/uapi/asm/sigcontext.h b/arch/arm64/include/uapi/asm/sigcontext.h
> index 8a45b7a411e0..e4cba8a6c9a2 100644
> --- a/arch/arm64/include/uapi/asm/sigcontext.h
> +++ b/arch/arm64/include/uapi/asm/sigcontext.h

[...]

> @@ -980,6 +1013,13 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
>  			return err;
>  	}
>  
> +	if (system_supports_poe()) {
> +		err = sigframe_alloc(user, &user->poe_offset,
> +				     sizeof(struct poe_context));
> +		if (err)
> +			return err;
> +	}
> +
>  	return sigframe_alloc_end(user);
>  }
>  
> @@ -1020,6 +1060,15 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user,
>  		__put_user_error(current->thread.fault_code, &esr_ctx->esr, err);
>  	}
>  
> +	if (system_supports_poe() && err == 0 && user->poe_offset) {
> +		struct poe_context __user *poe_ctx =
> +			apply_user_offset(user, user->poe_offset);
> +
> +		__put_user_error(POE_MAGIC, &poe_ctx->head.magic, err);
> +		__put_user_error(sizeof(*poe_ctx), &poe_ctx->head.size, err);
> +		__put_user_error(read_sysreg_s(SYS_POR_EL0), &poe_ctx->por_el0, err);
> +	}
> +

Does the AArch64 procedure call standard say anything about whether
POR_EL0 is caller-saved?

<bikeshed>

In theory we could skip saving this register if it is already
POR_EL0_INIT (which it often will be), and if the signal handler is not
supposed to modify and leave the modified value in the register when
returning.

The complexity of the additional check may be a bit pointless though,
and the handler might theoretically want to change the interrupted
code's POR_EL0 explicitly, which would be complicated if POE_MAGIC is
sometimes there and sometimes not.

</bikeshed>

[...]

Cheers
---Dave

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 19/29] arm64: enable PKEY support for CPUs with S1POE
  2024-05-03 13:01 ` [PATCH v4 19/29] arm64: enable PKEY support for CPUs with S1POE Joey Gouly
  2024-07-16 10:47   ` Anshuman Khandual
@ 2024-07-25 16:00   ` Dave Martin
  1 sibling, 0 replies; 146+ messages in thread
From: Dave Martin @ 2024-07-25 16:00 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

Hi,

On Fri, May 03, 2024 at 02:01:37PM +0100, Joey Gouly wrote:
> Now that PKEYs support has been implemented, enable it for CPUs that
> support S1POE.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
>  arch/arm64/include/asm/pkeys.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/pkeys.h b/arch/arm64/include/asm/pkeys.h
> index a284508a4d02..3ea928ec94c0 100644
> --- a/arch/arm64/include/asm/pkeys.h
> +++ b/arch/arm64/include/asm/pkeys.h
> @@ -17,7 +17,7 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
>  
>  static inline bool arch_pkeys_enabled(void)
>  {
> -	return false;
> +	return system_supports_poe();
>  }

Nit: maybe push this later in the series, at least to after the POE/PIE
patch, since pkeys won't work right otherwise on PIE-enabled platforms?

(I know it makes no difference without final Kconfig update, but it
feels more logical.)

[...]

Cheers
---Dave

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 17/29] arm64: implement PKEYS support
  2024-05-03 13:01 ` [PATCH v4 17/29] arm64: implement PKEYS support Joey Gouly
                     ` (4 preceding siblings ...)
  2024-07-23  4:22   ` Anshuman Khandual
@ 2024-07-25 16:12   ` Dave Martin
  5 siblings, 0 replies; 146+ messages in thread
From: Dave Martin @ 2024-07-25 16:12 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Fri, May 03, 2024 at 02:01:35PM +0100, Joey Gouly wrote:
> Implement the PKEYS interface, using the Permission Overlay Extension.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/mmu.h         |   1 +
>  arch/arm64/include/asm/mmu_context.h |  51 ++++++++++++-
>  arch/arm64/include/asm/pgtable.h     |  22 +++++-
>  arch/arm64/include/asm/pkeys.h       | 110 +++++++++++++++++++++++++++
>  arch/arm64/include/asm/por.h         |  33 ++++++++
>  arch/arm64/mm/mmu.c                  |  40 ++++++++++
>  6 files changed, 255 insertions(+), 2 deletions(-)
>  create mode 100644 arch/arm64/include/asm/pkeys.h
>  create mode 100644 arch/arm64/include/asm/por.h

[...]

> diff --git a/arch/arm64/include/asm/pkeys.h b/arch/arm64/include/asm/pkeys.h
> new file mode 100644
> index 000000000000..a284508a4d02
> --- /dev/null
> +++ b/arch/arm64/include/asm/pkeys.h
> @@ -0,0 +1,110 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2023 Arm Ltd.
> + *
> + * Based on arch/x86/include/asm/pkeys.h
> + */
> +
> +#ifndef _ASM_ARM64_PKEYS_H
> +#define _ASM_ARM64_PKEYS_H
> +
> +#define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2)
> +
> +#define arch_max_pkey() 7

Did you mean 8 ?  I'm guessing this may be the "off by one error" you
alluded to in your own reply to the cover letter, but just in case...

(x86 and powerpc seem to have booby-trapped the name of this macro for
the unwary...)

See also mm_pkey_{is_allocated,alloc}().

> +
> +int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
> +		unsigned long init_val);
> +
> +static inline bool arch_pkeys_enabled(void)
> +{
> +	return false;
> +}
> +
> +static inline int vma_pkey(struct vm_area_struct *vma)
> +{
> +	return (vma->vm_flags & ARCH_VM_PKEY_FLAGS) >> VM_PKEY_SHIFT;
> +}
> +
> +static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
> +		int prot, int pkey)
> +{
> +	if (pkey != -1)
> +		return pkey;
> +
> +	return vma_pkey(vma);
> +}
> +
> +static inline int execute_only_pkey(struct mm_struct *mm)
> +{
> +	// Execute-only mappings are handled by EPAN/FEAT_PAN3.
> +	WARN_ON_ONCE(!cpus_have_final_cap(ARM64_HAS_EPAN));
> +
> +	return -1;
> +}
> +
> +#define mm_pkey_allocation_map(mm)	(mm->context.pkey_allocation_map)

Pedantic nit: (mm)

although other arches have the same nit already, and it's probably low
risk given the scope and usage of these macros.

(Also, the outer parentheses are also redundant (if harmless).)

> +#define mm_set_pkey_allocated(mm, pkey) do {		\
> +	mm_pkey_allocation_map(mm) |= (1U << pkey);	\
> +} while (0)
> +#define mm_set_pkey_free(mm, pkey) do {			\
> +	mm_pkey_allocation_map(mm) &= ~(1U << pkey);	\
> +} while (0)
> +
> +static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
> +{
> +	/*
> +	 * "Allocated" pkeys are those that have been returned
> +	 * from pkey_alloc() or pkey 0 which is allocated
> +	 * implicitly when the mm is created.
> +	 */
> +	if (pkey < 0)
> +		return false;
> +	if (pkey >= arch_max_pkey())
> +		return false;

Did you mean > ?

> +
> +	return mm_pkey_allocation_map(mm) & (1U << pkey);
> +}
> +
> +/*
> + * Returns a positive, 3-bit key on success, or -1 on failure.
> + */
> +static inline int mm_pkey_alloc(struct mm_struct *mm)
> +{
> +	/*
> +	 * Note: this is the one and only place we make sure
> +	 * that the pkey is valid as far as the hardware is
> +	 * concerned.  The rest of the kernel trusts that
> +	 * only good, valid pkeys come out of here.
> +	 */
> +	u8 all_pkeys_mask = ((1U << arch_max_pkey()) - 1);

Nit: redundant outer ().

Also, GENMASK() and friends might be cleaner than spelling out this
idiom explicitly (but no big deal).

(1 << 7) - 1 is 0x7f, which doesn't feel right if pkeys 0..7 are all
supposed to be valid.  (See arch_max_pkey() above.)


(Also it looks mildly weird to have this before checking
arch_pkeys_enabled(), but since this is likely to be constant-folded by
the compiler, I guess it almost certainly makes no difference.  It's
harmless either way.)

> +	int ret;
> +
> +	if (!arch_pkeys_enabled())
> +		return -1;
> +
> +	/*
> +	 * Are we out of pkeys?  We must handle this specially
> +	 * because ffz() behavior is undefined if there are no
> +	 * zeros.
> +	 */
> +	if (mm_pkey_allocation_map(mm) == all_pkeys_mask)
> +		return -1;
> +
> +	ret = ffz(mm_pkey_allocation_map(mm));
> +
> +	mm_set_pkey_allocated(mm, ret);
> +
> +	return ret;
> +}
> +
> +static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
> +{
> +	if (!mm_pkey_is_allocated(mm, pkey))
> +		return -EINVAL;

Does anything prevent a pkey_free(0)?

I couldn't find any check related to this so far.

If not, this may be a generic problem, better solved through a wrapper
in the generic mm code.

Userspace has to have at least one PKEY allocated, since the pte field
has to be set to something...  unless we turn PKEYs on or off per mm.
But the pkeys API doesn't seem to be designed that way (and it doesn't
look very useful).


> +
> +	mm_set_pkey_free(mm, pkey);
> +
> +	return 0;
> +}
> +
> +#endif /* _ASM_ARM64_PKEYS_H */
> diff --git a/arch/arm64/include/asm/por.h b/arch/arm64/include/asm/por.h
> new file mode 100644
> index 000000000000..d6604e0c5c54
> --- /dev/null
> +++ b/arch/arm64/include/asm/por.h
> @@ -0,0 +1,33 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2023 Arm Ltd.
> + */
> +
> +#ifndef _ASM_ARM64_POR_H
> +#define _ASM_ARM64_POR_H
> +
> +#define POR_BITS_PER_PKEY		4
> +#define POR_ELx_IDX(por_elx, idx)	(((por_elx) >> (idx * POR_BITS_PER_PKEY)) & 0xf)

Nit: (idx)

Since this is shared with other code in a header, it's probably best
to avoid surprises.

[...]

> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 495b732d5af3..e50ccc86d150 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -25,6 +25,7 @@
>  #include <linux/vmalloc.h>
>  #include <linux/set_memory.h>
>  #include <linux/kfence.h>
> +#include <linux/pkeys.h>
>  
>  #include <asm/barrier.h>
>  #include <asm/cputype.h>
> @@ -1535,3 +1536,42 @@ void __cpu_replace_ttbr1(pgd_t *pgdp, bool cnp)
>  
>  	cpu_uninstall_idmap();
>  }
> +
> +#ifdef CONFIG_ARCH_HAS_PKEYS
> +int arch_set_user_pkey_access(struct task_struct *tsk, int pkey, unsigned long init_val)
> +{
> +	u64 new_por = POE_RXW;
> +	u64 old_por;
> +	u64 pkey_shift;
> +
> +	if (!arch_pkeys_enabled())
> +		return -ENOSPC;
> +
> +	/*
> +	 * This code should only be called with valid 'pkey'
> +	 * values originating from in-kernel users.  Complain
> +	 * if a bad value is observed.
> +	 */
> +	if (WARN_ON_ONCE(pkey >= arch_max_pkey()))
> +		return -EINVAL;
> +
> +	/* Set the bits we need in POR:  */
> +	if (init_val & PKEY_DISABLE_ACCESS)
> +		new_por = POE_X;
> +	else if (init_val & PKEY_DISABLE_WRITE)
> +		new_por = POE_RX;
> +
> +	/* Shift the bits in to the correct place in POR for pkey: */
> +	pkey_shift = pkey * POR_BITS_PER_PKEY;
> +	new_por <<= pkey_shift;
> +
> +	/* Get old POR and mask off any old bits in place: */
> +	old_por = read_sysreg_s(SYS_POR_EL0);
> +	old_por &= ~(POE_MASK << pkey_shift);
> +
> +	/* Write old part along with new part: */
> +	write_sysreg_s(old_por | new_por, SYS_POR_EL0);
> +
> +	return 0;
> +}
> +#endif

<bikeshed>

Although this is part of the existing PKEYS support, it feels weird to
have to initialise the permissions with one interface and one set of
flags, then change the permissions using an arch-specific interface and
a different set of flags (i.e., directly writing POR_EL0) later on.

Is there any merit in defining a vDSO function for changing the flags in
userspace?  This would allow userspace to use PKEYS in a generic way
without a nasty per-arch volatile asm hack.  

(Maybe too late for stopping user libraries rolling their own, though.)


Since we ideally don't want to write the above flags-munging code
twice, there would be the option of implementing pkey_alloc() via a
vDSO wrapper on arm64 (though this might be more trouble than it is
worth).

Of course, this is all pointless if people thought that even the
overhead of a vDSO call was unacceptable for flipping the permissions
on and off.  Either way, this is a potential enhancement, orthogonal to
this series...

</bikeshed>

[...]

Cheers
---Dave

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-07-25 15:58         ` Dave Martin
@ 2024-07-25 18:11           ` Mark Brown
  2024-07-26 16:14             ` Dave Martin
  0 siblings, 1 reply; 146+ messages in thread
From: Mark Brown @ 2024-07-25 18:11 UTC (permalink / raw)
  To: Dave Martin
  Cc: Amit Daniel Kachhap, Joey Gouly, linux-arm-kernel, akpm,
	aneesh.kumar, aneesh.kumar, bp, catalin.marinas, christophe.leroy,
	dave.hansen, hpa, linux-fsdevel, linux-mm, linuxppc-dev, maz,
	mingo, mpe, naveen.n.rao, npiggin, oliver.upton, shuah,
	szabolcs.nagy, tglx, will, x86, kvmarm


On Thu, Jul 25, 2024 at 04:58:27PM +0100, Dave Martin wrote:

> I'll post a draft patch separately, since I think the update could
> benefit from separate discussion, but my back-of-the-envelope
> calculation suggests that (before this patch) we are down to 0x90
> bytes of free space (i.e., over 96% full).

> I wonder whether it is time to start pushing back on adding a new
> _foo_context for every individual register, though?

> Maybe we could add some kind of _misc_context for miscellaneous 64-bit
> regs.

That'd have to be a variably sized structure with pairs of sysreg
ID/value items in it I think which would be a bit of a pain to implement
but doable.  The per-record header is 64 bits, we'd get maximal saving
by allocating a byte for the IDs.

It would be very unfortunate timing to start gating things on such a
change though (I'm particularly worried about GCS here, at this point
the kernel changes are blocking the entire ecosystem).


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-07-25 18:11           ` Mark Brown
@ 2024-07-26 16:14             ` Dave Martin
  2024-07-26 17:39               ` Mark Brown
  0 siblings, 1 reply; 146+ messages in thread
From: Dave Martin @ 2024-07-26 16:14 UTC (permalink / raw)
  To: Mark Brown
  Cc: Amit Daniel Kachhap, Joey Gouly, linux-arm-kernel, akpm,
	aneesh.kumar, aneesh.kumar, bp, catalin.marinas, christophe.leroy,
	dave.hansen, hpa, linux-fsdevel, linux-mm, linuxppc-dev, maz,
	mingo, mpe, naveen.n.rao, npiggin, oliver.upton, shuah,
	szabolcs.nagy, tglx, will, x86, kvmarm

On Thu, Jul 25, 2024 at 07:11:41PM +0100, Mark Brown wrote:
> On Thu, Jul 25, 2024 at 04:58:27PM +0100, Dave Martin wrote:
> 
> > I'll post a draft patch separately, since I think the update could
> > benefit from separate discussion, but my back-of-the-envelope
> > calculation suggests that (before this patch) we are down to 0x90
> > bytes of free space (i.e., over 96% full).
> 
> > I wonder whether it is time to start pushing back on adding a new
> > _foo_context for every individual register, though?
> 
> > Maybe we could add some kind of _misc_context for miscellaneous 64-bit
> > regs.
> 
> That'd have to be a variably sized structure with pairs of sysreg
> ID/value items in it I think which would be a bit of a pain to implement
> but doable.  The per-record header is 64 bits, we'd get maximal saving
> by allocating a byte for the IDs.

Or possibly the regs could be identified positionally, avoiding the
need for IDs.  Space would be at a premium, and we would have to think
carefully about what should and should not be allowed in there.

> It would be very unfortunate timing to start gating things on such a
> change though (I'm particularly worried about GCS here, at this point
> the kernel changes are blocking the entire ecosystem).

For GCS, I wonder whether it should be made a strictly opt-in feature:
i.e., if you use it then you must tolerate large sigframes, and if it
is turned off then its state is neither dumped nor restored.  Since GCS
requires an explicit prctl to turn it on, the mechanism seems partly
there already in your series.

I guess the GCS thread is the better place to discuss that, though.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-07-26 16:14             ` Dave Martin
@ 2024-07-26 17:39               ` Mark Brown
  2024-07-29 14:27                 ` Dave Martin
  0 siblings, 1 reply; 146+ messages in thread
From: Mark Brown @ 2024-07-26 17:39 UTC (permalink / raw)
  To: Dave Martin
  Cc: Amit Daniel Kachhap, Joey Gouly, linux-arm-kernel, akpm,
	aneesh.kumar, aneesh.kumar, bp, catalin.marinas, christophe.leroy,
	dave.hansen, hpa, linux-fsdevel, linux-mm, linuxppc-dev, maz,
	mingo, mpe, naveen.n.rao, npiggin, oliver.upton, shuah,
	szabolcs.nagy, tglx, will, x86, kvmarm


On Fri, Jul 26, 2024 at 05:14:01PM +0100, Dave Martin wrote:
> On Thu, Jul 25, 2024 at 07:11:41PM +0100, Mark Brown wrote:

> > That'd have to be a variably sized structure with pairs of sysreg
> > ID/value items in it I think which would be a bit of a pain to implement
> > but doable.  The per-record header is 64 bits, we'd get maximal saving
> > by allocating a byte for the IDs.

> Or possibly the regs could be identified positionally, avoiding the
> need for IDs.  Space would be at a premium, and we would have to think
> carefully about what should and should not be allowed in there.

Yes, though that would mean that if we had to generate any register in
there, we'd always have to generate at least as many entries as whatever
number it got assigned, which, depending on how much optionality ends up
getting used, might be unfortunate.

> > It would be very unfortunate timing to start gating things on such a
> > change though (I'm particularly worried about GCS here, at this point
> > the kernel changes are blocking the entire ecosystem).

> For GCS, I wonder whether it should be made a strictly opt-in feature:
> i.e., if you use it then you must tolerate large sigframes, and if it
> is turned off then its state is neither dumped nor restored.  Since GCS
> requires an explicit prctl to turn it on, the mechanism seems partly
> there already in your series.

Yeah, that's what the current code does actually.  In any case it's not
just a single register - there's also the GCS mode in there.


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-07-26 17:39               ` Mark Brown
@ 2024-07-29 14:27                 ` Dave Martin
  2024-07-29 14:41                   ` Mark Brown
  0 siblings, 1 reply; 146+ messages in thread
From: Dave Martin @ 2024-07-29 14:27 UTC (permalink / raw)
  To: Mark Brown
  Cc: Amit Daniel Kachhap, Joey Gouly, linux-arm-kernel, akpm,
	aneesh.kumar, aneesh.kumar, bp, catalin.marinas, christophe.leroy,
	dave.hansen, hpa, linux-fsdevel, linux-mm, linuxppc-dev, maz,
	mingo, mpe, naveen.n.rao, npiggin, oliver.upton, shuah,
	szabolcs.nagy, tglx, will, x86, kvmarm

On Fri, Jul 26, 2024 at 06:39:27PM +0100, Mark Brown wrote:
> On Fri, Jul 26, 2024 at 05:14:01PM +0100, Dave Martin wrote:
> > On Thu, Jul 25, 2024 at 07:11:41PM +0100, Mark Brown wrote:
> 
> > > That'd have to be a variably sized structure with pairs of sysreg
> > > ID/value items in it I think which would be a bit of a pain to implement
> > > but doable.  The per-record header is 64 bits, we'd get maximal saving
> > > by allocating a byte for the IDs.
> 
> > Or possibly the regs could be identified positionally, avoiding the
> > need for IDs.  Space would be at a premium, and we would have to think
> > carefully about what should and should not be allowed in there.
> 
> Yes, though that would mean if we had to generate any register in there
> we'd always have to generate at least as many entries as whatever number
> it got assigned which depending on how much optionality ends up getting
> used might be unfortunate.

Ack, though it's only 150 bytes or so at most, so just zeroing it all
(or as much as we know about) doesn't feel like a big cost.

It depends how determined we are to squeeze the most out of the
remaining space.


> > > It would be very unfortunate timing to start gating things on such a
> > > change though (I'm particularly worried about GCS here, at this point
> > > the kernel changes are blocking the entire ecosystem).
> 
> > For GCS, I wonder whether it should be made a strictly opt-in feature:
> > i.e., if you use it then you must tolerate large sigframes, and if it
> > is turned off then its state is neither dumped nor restored.  Since GCS
> > requires an explicit prctl to turn it on, the mechanism seems partly
> > there already in your series.
> 
> Yeah, that's what the current code does actually.  In any case it's not
> just a single register - there's also the GCS mode in there.

Agreed -- I'll ping the GCS series, but this sounds like a reasonable
starting point.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-07-29 14:27                 ` Dave Martin
@ 2024-07-29 14:41                   ` Mark Brown
  0 siblings, 0 replies; 146+ messages in thread
From: Mark Brown @ 2024-07-29 14:41 UTC (permalink / raw)
  To: Dave Martin
  Cc: Amit Daniel Kachhap, Joey Gouly, linux-arm-kernel, akpm,
	aneesh.kumar, aneesh.kumar, bp, catalin.marinas, christophe.leroy,
	dave.hansen, hpa, linux-fsdevel, linux-mm, linuxppc-dev, maz,
	mingo, mpe, naveen.n.rao, npiggin, oliver.upton, shuah,
	szabolcs.nagy, tglx, will, x86, kvmarm


On Mon, Jul 29, 2024 at 03:27:11PM +0100, Dave Martin wrote:
> On Fri, Jul 26, 2024 at 06:39:27PM +0100, Mark Brown wrote:

> > Yes, though that would mean if we had to generate any register in there
> > we'd always have to generate at least as many entries as whatever number
> > it got assigned which depending on how much optionality ends up getting
> > used might be unfortunate.

> Ack, though it's only 150 bytes or so at most, so just zeroing it all
> (or as much as we know about) doesn't feel like a big cost.

> It depends how determined we are to squeeze the most out of the
> remaining space.

Indeed, I was more thinking about how it might scale as the number of
extensions grows rather than the current costs.


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 13/29] arm64: convert protection key into vm_flags and pgprot values
  2024-07-25 15:49   ` Dave Martin
@ 2024-08-01 10:55     ` Joey Gouly
  2024-08-01 11:01       ` Dave Martin
  0 siblings, 1 reply; 146+ messages in thread
From: Joey Gouly @ 2024-08-01 10:55 UTC (permalink / raw)
  To: Dave Martin
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Thu, Jul 25, 2024 at 04:49:50PM +0100, Dave Martin wrote:
> On Fri, May 03, 2024 at 02:01:31PM +0100, Joey Gouly wrote:
> > Modify arch_calc_vm_prot_bits() and vm_get_page_prot() such that the pkey
> > value is set in the vm_flags and then into the pgprot value.
> > 
> > Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > ---
> >  arch/arm64/include/asm/mman.h | 8 +++++++-
> >  arch/arm64/mm/mmap.c          | 9 +++++++++
> >  2 files changed, 16 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
> > index 5966ee4a6154..ecb2d18dc4d7 100644
> > --- a/arch/arm64/include/asm/mman.h
> > +++ b/arch/arm64/include/asm/mman.h
> > @@ -7,7 +7,7 @@
> >  #include <uapi/asm/mman.h>
> >  
> >  static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> > -	unsigned long pkey __always_unused)
> > +	unsigned long pkey)
> >  {
> >  	unsigned long ret = 0;
> >  
> > @@ -17,6 +17,12 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> >  	if (system_supports_mte() && (prot & PROT_MTE))
> >  		ret |= VM_MTE;
> >  
> > +#if defined(CONFIG_ARCH_HAS_PKEYS)
> > +	ret |= pkey & 0x1 ? VM_PKEY_BIT0 : 0;
> > +	ret |= pkey & 0x2 ? VM_PKEY_BIT1 : 0;
> > +	ret |= pkey & 0x4 ? VM_PKEY_BIT2 : 0;
> 
> Out of interest, is this as bad as it looks or does the compiler turn
> it into a shift and mask?

Yeah, (gcc 13.2) produces good code here (this is do_mprotect_pkey after removing a lot of branching):

	and     w0, w0, #0x7
	orr     x1, x1, x0, lsl #32

> 
> 
> > +#endif
> > +
> >  	return ret;
> >  }
> >  #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
> > diff --git a/arch/arm64/mm/mmap.c b/arch/arm64/mm/mmap.c
> > index 642bdf908b22..86eda6bc7893 100644
> > --- a/arch/arm64/mm/mmap.c
> > +++ b/arch/arm64/mm/mmap.c
> > @@ -102,6 +102,15 @@ pgprot_t vm_get_page_prot(unsigned long vm_flags)
> >  	if (vm_flags & VM_MTE)
> >  		prot |= PTE_ATTRINDX(MT_NORMAL_TAGGED);
> >  
> > +#ifdef CONFIG_ARCH_HAS_PKEYS
> > +	if (vm_flags & VM_PKEY_BIT0)
> > +		prot |= PTE_PO_IDX_0;
> > +	if (vm_flags & VM_PKEY_BIT1)
> > +		prot |= PTE_PO_IDX_1;
> > +	if (vm_flags & VM_PKEY_BIT2)
> > +		prot |= PTE_PO_IDX_2;
> > +#endif
> > +
> 
> Ditto.  At least we only have three bits to cope with either way.
> 
> I'm guessing that these functions are not super-hot path.
> 
> [...]
> 
> Cheers
> ---Dave

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 13/29] arm64: convert protection key into vm_flags and pgprot values
  2024-08-01 10:55     ` Joey Gouly
@ 2024-08-01 11:01       ` Dave Martin
  0 siblings, 0 replies; 146+ messages in thread
From: Dave Martin @ 2024-08-01 11:01 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Thu, Aug 01, 2024 at 11:55:02AM +0100, Joey Gouly wrote:
> On Thu, Jul 25, 2024 at 04:49:50PM +0100, Dave Martin wrote:
> > On Fri, May 03, 2024 at 02:01:31PM +0100, Joey Gouly wrote:
> > > Modify arch_calc_vm_prot_bits() and vm_get_page_prot() such that the pkey
> > > value is set in the vm_flags and then into the pgprot value.
> > > 
> > > Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> > > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > Cc: Will Deacon <will@kernel.org>
> > > ---
> > >  arch/arm64/include/asm/mman.h | 8 +++++++-
> > >  arch/arm64/mm/mmap.c          | 9 +++++++++
> > >  2 files changed, 16 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
> > > index 5966ee4a6154..ecb2d18dc4d7 100644
> > > --- a/arch/arm64/include/asm/mman.h
> > > +++ b/arch/arm64/include/asm/mman.h
> > > @@ -7,7 +7,7 @@
> > >  #include <uapi/asm/mman.h>
> > >  
> > >  static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> > > -	unsigned long pkey __always_unused)
> > > +	unsigned long pkey)
> > >  {
> > >  	unsigned long ret = 0;
> > >  
> > > @@ -17,6 +17,12 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> > >  	if (system_supports_mte() && (prot & PROT_MTE))
> > >  		ret |= VM_MTE;
> > >  
> > > +#if defined(CONFIG_ARCH_HAS_PKEYS)
> > > +	ret |= pkey & 0x1 ? VM_PKEY_BIT0 : 0;
> > > +	ret |= pkey & 0x2 ? VM_PKEY_BIT1 : 0;
> > > +	ret |= pkey & 0x4 ? VM_PKEY_BIT2 : 0;
> > 
> > Out of interest, is this as bad as it looks or does the compiler turn
> > it into a shift and mask?
> 
> Yeah, (gcc 13.2) produces good code here (this is do_mprotect_pkey after removing a lot of branching):
> 
> 	and     w0, w0, #0x7
> 	orr     x1, x1, x0, lsl #32

Neat, good ol' gcc!

Cheers
---Dave

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-07-25 16:00   ` Dave Martin
@ 2024-08-01 15:54     ` Joey Gouly
  2024-08-01 16:22       ` Dave Martin
  0 siblings, 1 reply; 146+ messages in thread
From: Joey Gouly @ 2024-08-01 15:54 UTC (permalink / raw)
  To: Dave Martin
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Thu, Jul 25, 2024 at 05:00:18PM +0100, Dave Martin wrote:
> Hi,
> 
> On Fri, May 03, 2024 at 02:01:36PM +0100, Joey Gouly wrote:
> > Add PKEY support to signals, by saving and restoring POR_EL0 from the stackframe.
> > 
> > Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > Reviewed-by: Mark Brown <broonie@kernel.org>
> > Acked-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
> > ---
> >  arch/arm64/include/uapi/asm/sigcontext.h |  7 ++++
> >  arch/arm64/kernel/signal.c               | 52 ++++++++++++++++++++++++
> >  2 files changed, 59 insertions(+)
> > 
> > diff --git a/arch/arm64/include/uapi/asm/sigcontext.h b/arch/arm64/include/uapi/asm/sigcontext.h
> > index 8a45b7a411e0..e4cba8a6c9a2 100644
> > --- a/arch/arm64/include/uapi/asm/sigcontext.h
> > +++ b/arch/arm64/include/uapi/asm/sigcontext.h
> 
> [...]
> 
> > @@ -980,6 +1013,13 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
> >  			return err;
> >  	}
> >  
> > +	if (system_supports_poe()) {
> > +		err = sigframe_alloc(user, &user->poe_offset,
> > +				     sizeof(struct poe_context));
> > +		if (err)
> > +			return err;
> > +	}
> > +
> >  	return sigframe_alloc_end(user);
> >  }
> >  
> > @@ -1020,6 +1060,15 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user,
> >  		__put_user_error(current->thread.fault_code, &esr_ctx->esr, err);
> >  	}
> >  
> > +	if (system_supports_poe() && err == 0 && user->poe_offset) {
> > +		struct poe_context __user *poe_ctx =
> > +			apply_user_offset(user, user->poe_offset);
> > +
> > +		__put_user_error(POE_MAGIC, &poe_ctx->head.magic, err);
> > +		__put_user_error(sizeof(*poe_ctx), &poe_ctx->head.size, err);
> > +		__put_user_error(read_sysreg_s(SYS_POR_EL0), &poe_ctx->por_el0, err);
> > +	}
> > +
> 
> Does the AArch64 procedure call standard say anything about whether
> POR_EL0 is caller-saved?

I asked about this, and it doesn't say anything and they don't plan on it,
since it's very application specific.

> 
> <bikeshed>
> 
> In theory we could skip saving this register if it is already
> POR_EL0_INIT (which it often will be), and if the signal handler is not
> supposed to modify and leave the modified value in the register when
> returning.
> 
> The complexity of the additional check may be a bit pointless though,
> and the handler might theoretically want to change the interrupted
> code's POR_EL0 explicitly, which would be complicated if POE_MAGIC is
> sometimes there and sometimes not.
> 
> </bikeshed>
> 
I think trying to skip/optimise something here would be more effort than any
possible benefits!

Thanks,
Joey

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 15/29] arm64: handle PKEY/POE faults
  2024-07-25 15:57   ` Dave Martin
@ 2024-08-01 16:01     ` Joey Gouly
  2024-08-06 13:33       ` Dave Martin
  0 siblings, 1 reply; 146+ messages in thread
From: Joey Gouly @ 2024-08-01 16:01 UTC (permalink / raw)
  To: Dave Martin
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Thu, Jul 25, 2024 at 04:57:09PM +0100, Dave Martin wrote:
> On Fri, May 03, 2024 at 02:01:33PM +0100, Joey Gouly wrote:
> > If a memory fault occurs that is due to an overlay/pkey fault, report that to
> > userspace with a SEGV_PKUERR.
> > 
> > Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > ---
> >  arch/arm64/include/asm/traps.h |  1 +
> >  arch/arm64/kernel/traps.c      | 12 ++++++--
> >  arch/arm64/mm/fault.c          | 56 ++++++++++++++++++++++++++++++++--
> >  3 files changed, 64 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/traps.h b/arch/arm64/include/asm/traps.h
> > index eefe766d6161..f6f6f2cb7f10 100644
> > --- a/arch/arm64/include/asm/traps.h
> > +++ b/arch/arm64/include/asm/traps.h
> > @@ -25,6 +25,7 @@ try_emulate_armv8_deprecated(struct pt_regs *regs, u32 insn)
> >  void force_signal_inject(int signal, int code, unsigned long address, unsigned long err);
> >  void arm64_notify_segfault(unsigned long addr);
> >  void arm64_force_sig_fault(int signo, int code, unsigned long far, const char *str);
> > +void arm64_force_sig_fault_pkey(int signo, int code, unsigned long far, const char *str, int pkey);
> >  void arm64_force_sig_mceerr(int code, unsigned long far, short lsb, const char *str);
> >  void arm64_force_sig_ptrace_errno_trap(int errno, unsigned long far, const char *str);
> >  
> > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > index 215e6d7f2df8..1bac6c84d3f5 100644
> > --- a/arch/arm64/kernel/traps.c
> > +++ b/arch/arm64/kernel/traps.c
> > @@ -263,16 +263,24 @@ static void arm64_show_signal(int signo, const char *str)
> >  	__show_regs(regs);
> >  }
> >  
> > -void arm64_force_sig_fault(int signo, int code, unsigned long far,
> > -			   const char *str)
> > +void arm64_force_sig_fault_pkey(int signo, int code, unsigned long far,
> > +			   const char *str, int pkey)
> >  {
> >  	arm64_show_signal(signo, str);
> >  	if (signo == SIGKILL)
> >  		force_sig(SIGKILL);
> > +	else if (code == SEGV_PKUERR)
> > +		force_sig_pkuerr((void __user *)far, pkey);
> 
> Is signo definitely SIGSEGV here?  It looks to me like we can get in
> here for SIGBUS, SIGTRAP etc.
> 
> si_codes are not unique between different signo here, so I'm wondering
> whether this should this be:
> 
> 	else if (signo == SIGSEGV && code == SEGV_PKUERR)
> 
> ...?
> 
> 
> >  	else
> >  		force_sig_fault(signo, code, (void __user *)far);
> >  }
> >  
> > +void arm64_force_sig_fault(int signo, int code, unsigned long far,
> > +			   const char *str)
> > +{
> > +	arm64_force_sig_fault_pkey(signo, code, far, str, 0);
> 
> Is there a reason not to follow the same convention as elsewhere, where
> -1 is passed for "no pkey"?
> 
> If we think this should never be called with signo == SIGSEGV &&
> code == SEGV_PKUERR and no valid pkey, but it's messy to prove, then
> maybe a WARN_ON_ONCE() would be worth it here?
> 

Anshuman suggested separating them out, which I did as below; I think that
addresses your comments too?

diff --git arch/arm64/kernel/traps.c arch/arm64/kernel/traps.c
index 215e6d7f2df8..49bac9ae04c0 100644
--- arch/arm64/kernel/traps.c
+++ arch/arm64/kernel/traps.c
@@ -273,6 +273,13 @@ void arm64_force_sig_fault(int signo, int code, unsigned long far,
                force_sig_fault(signo, code, (void __user *)far);
 }
 
+void arm64_force_sig_fault_pkey(int signo, int code, unsigned long far,
+                          const char *str, int pkey)
+{
+       arm64_show_signal(signo, str);
+       force_sig_pkuerr((void __user *)far, pkey);
+}
+
 void arm64_force_sig_mceerr(int code, unsigned long far, short lsb,
                            const char *str)
 {

diff --git arch/arm64/mm/fault.c arch/arm64/mm/fault.c
index 451ba7cbd5ad..1ddd46b97f88 100644
--- arch/arm64/mm/fault.c
+++ arch/arm64/mm/fault.c

-               arm64_force_sig_fault(SIGSEGV, si_code, far, inf->name);
+               if (si_code == SEGV_PKUERR)
+                       arm64_force_sig_fault_pkey(SIGSEGV, si_code, far, inf->name, pkey);
+               else
+                       arm64_force_sig_fault(SIGSEGV, si_code, far, inf->name);


Thanks,
Joey

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 10/29] arm64: enable the Permission Overlay Extension for EL0
  2024-07-25 15:49   ` Dave Martin
@ 2024-08-01 16:04     ` Joey Gouly
  2024-08-01 16:31       ` Dave Martin
  0 siblings, 1 reply; 146+ messages in thread
From: Joey Gouly @ 2024-08-01 16:04 UTC (permalink / raw)
  To: Dave Martin
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Thu, Jul 25, 2024 at 04:49:08PM +0100, Dave Martin wrote:
> On Fri, May 03, 2024 at 02:01:28PM +0100, Joey Gouly wrote:
> > Expose a HWCAP and ID_AA64MMFR3_EL1_S1POE to userspace, so they can be used to
> > check if the CPU supports the feature.
> > 
> > Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > ---
> > 
> > This takes the last bit of HWCAP2, is this fine? What can we do about more features in the future?
> > 
> > 
> >  Documentation/arch/arm64/elf_hwcaps.rst |  2 ++
> >  arch/arm64/include/asm/hwcap.h          |  1 +
> >  arch/arm64/include/uapi/asm/hwcap.h     |  1 +
> >  arch/arm64/kernel/cpufeature.c          | 14 ++++++++++++++
> >  arch/arm64/kernel/cpuinfo.c             |  1 +
> >  5 files changed, 19 insertions(+)
> > 
> > diff --git a/Documentation/arch/arm64/elf_hwcaps.rst b/Documentation/arch/arm64/elf_hwcaps.rst
> > index 448c1664879b..694f67fa07d1 100644
> > --- a/Documentation/arch/arm64/elf_hwcaps.rst
> > +++ b/Documentation/arch/arm64/elf_hwcaps.rst
> > @@ -365,6 +365,8 @@ HWCAP2_SME_SF8DP2
> >  HWCAP2_SME_SF8DP4
> >      Functionality implied by ID_AA64SMFR0_EL1.SF8DP4 == 0b1.
> >  
> > +HWCAP2_POE
> > +    Functionality implied by ID_AA64MMFR3_EL1.S1POE == 0b0001.
> 
> Nit: unintentionally dropped blank line before the section heading?

Now there's only one blank line, I think.
c1932cac7902a8b0f7355515917dedc5412eb15d unintentionally added 2 blank
lines; before that it was always 1!

> 
> >  
> >  4. Unused AT_HWCAP bits
> >  -----------------------
> 
> [...]
> 
> Cheers
> ---Dave

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-08-01 15:54     ` Joey Gouly
@ 2024-08-01 16:22       ` Dave Martin
  2024-08-06 10:35         ` Joey Gouly
  0 siblings, 1 reply; 146+ messages in thread
From: Dave Martin @ 2024-08-01 16:22 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Thu, Aug 01, 2024 at 04:54:41PM +0100, Joey Gouly wrote:
> On Thu, Jul 25, 2024 at 05:00:18PM +0100, Dave Martin wrote:
> > Hi,
> > 
> > On Fri, May 03, 2024 at 02:01:36PM +0100, Joey Gouly wrote:
> > > Add PKEY support to signals, by saving and restoring POR_EL0 from the stackframe.
> > > 
> > > Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> > > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > Cc: Will Deacon <will@kernel.org>
> > > Reviewed-by: Mark Brown <broonie@kernel.org>
> > > Acked-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
> > > ---
> > >  arch/arm64/include/uapi/asm/sigcontext.h |  7 ++++
> > >  arch/arm64/kernel/signal.c               | 52 ++++++++++++++++++++++++
> > >  2 files changed, 59 insertions(+)
> > > 
> > > diff --git a/arch/arm64/include/uapi/asm/sigcontext.h b/arch/arm64/include/uapi/asm/sigcontext.h
> > > index 8a45b7a411e0..e4cba8a6c9a2 100644
> > > --- a/arch/arm64/include/uapi/asm/sigcontext.h
> > > +++ b/arch/arm64/include/uapi/asm/sigcontext.h
> > 
> > [...]
> > 
> > > @@ -980,6 +1013,13 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
> > >  			return err;
> > >  	}
> > >  
> > > +	if (system_supports_poe()) {
> > > +		err = sigframe_alloc(user, &user->poe_offset,
> > > +				     sizeof(struct poe_context));
> > > +		if (err)
> > > +			return err;
> > > +	}
> > > +
> > >  	return sigframe_alloc_end(user);
> > >  }
> > >  
> > > @@ -1020,6 +1060,15 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user,
> > >  		__put_user_error(current->thread.fault_code, &esr_ctx->esr, err);
> > >  	}
> > >  
> > > +	if (system_supports_poe() && err == 0 && user->poe_offset) {
> > > +		struct poe_context __user *poe_ctx =
> > > +			apply_user_offset(user, user->poe_offset);
> > > +
> > > +		__put_user_error(POE_MAGIC, &poe_ctx->head.magic, err);
> > > +		__put_user_error(sizeof(*poe_ctx), &poe_ctx->head.size, err);
> > > +		__put_user_error(read_sysreg_s(SYS_POR_EL0), &poe_ctx->por_el0, err);
> > > +	}
> > > +
> > 
> > Does the AArch64 procedure call standard say anything about whether
> > POR_EL0 is caller-saved?
> 
> I asked about this, and it doesn't say anything and they don't plan on it,
> since it's very application specific.

Right.  I think that confirms that we don't absolutely need to preserve
POR_EL0, because if compiler-generated code was allowed to fiddle with
this and not clean up after itself, the PCS would have to document this.

> > 
> > <bikeshed>
> > 
> > In theory we could skip saving this register if it is already
> > POR_EL0_INIT (which it often will be), and if the signal handler is not
> > supposed to modify and leave the modified value in the register when
> > returning.
> > 
> > The complexity of the additional check may be a bit pointless though,
> > and the handler might theoretically want to change the interrupted
> > code's POR_EL0 explicitly, which would be complicated if POE_MAGIC is
> > sometimes there and sometimes not.
> > 
> > </bikeshed>
> > 
> I think trying to skip/optimise something here would be more effort than any
> possible benefits!

Actually, having thought about this some more I think that only dumping
this register if != POR_EL0_INIT may be the right thing to do.

This would mean that old binaries' stacks would never see poe_context in
the signal frame, and so will never experience unexpected stack
overruns (at least, not due solely to the presence of this feature).

POE-aware signal handlers have to do something fiddly and nonportable
to obtain the original value of POR_EL0 regardless, so requiring them
do handle both cases (present in sigframe and absent) doesn't seem too
onerous to me.


Do you think this approach would break any known use cases?

Cheers
---Dave

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 10/29] arm64: enable the Permission Overlay Extension for EL0
  2024-08-01 16:04     ` Joey Gouly
@ 2024-08-01 16:31       ` Dave Martin
  0 siblings, 0 replies; 146+ messages in thread
From: Dave Martin @ 2024-08-01 16:31 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Thu, Aug 01, 2024 at 05:04:03PM +0100, Joey Gouly wrote:
> On Thu, Jul 25, 2024 at 04:49:08PM +0100, Dave Martin wrote:
> > On Fri, May 03, 2024 at 02:01:28PM +0100, Joey Gouly wrote:
> > > Expose a HWCAP and ID_AA64MMFR3_EL1_S1POE to userspace, so they can be used to
> > > check if the CPU supports the feature.
> > > 
> > > Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> > > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > Cc: Will Deacon <will@kernel.org>
> > > ---
> > > 
> > > This takes the last bit of HWCAP2, is this fine? What can we do about more features in the future?
> > > 
> > > 
> > >  Documentation/arch/arm64/elf_hwcaps.rst |  2 ++
> > >  arch/arm64/include/asm/hwcap.h          |  1 +
> > >  arch/arm64/include/uapi/asm/hwcap.h     |  1 +
> > >  arch/arm64/kernel/cpufeature.c          | 14 ++++++++++++++
> > >  arch/arm64/kernel/cpuinfo.c             |  1 +
> > >  5 files changed, 19 insertions(+)
> > > 
> > > diff --git a/Documentation/arch/arm64/elf_hwcaps.rst b/Documentation/arch/arm64/elf_hwcaps.rst
> > > index 448c1664879b..694f67fa07d1 100644
> > > --- a/Documentation/arch/arm64/elf_hwcaps.rst
> > > +++ b/Documentation/arch/arm64/elf_hwcaps.rst
> > > @@ -365,6 +365,8 @@ HWCAP2_SME_SF8DP2
> > >  HWCAP2_SME_SF8DP4
> > >      Functionality implied by ID_AA64SMFR0_EL1.SF8DP4 == 0b1.
> > >  
> > > +HWCAP2_POE
> > > +    Functionality implied by ID_AA64MMFR3_EL1.S1POE == 0b0001.
> > 
> > Nit: unintentionally dropped blank line before the section heading?
> 
> Now there's only one blank line, I think.
> c1932cac7902a8b0f7355515917dedc5412eb15d unintentionally added 2 blank
> lines; before that it was always 1!

Hmmm, true.  Not a big deal, I guess.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 04/29] arm64: disable trapping of POR_EL0 to EL2
  2024-07-25 15:44   ` Dave Martin
@ 2024-08-06 10:04     ` Joey Gouly
  0 siblings, 0 replies; 146+ messages in thread
From: Joey Gouly @ 2024-08-06 10:04 UTC (permalink / raw)
  To: Dave Martin
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Thu, Jul 25, 2024 at 04:44:13PM +0100, Dave Martin wrote:
> Hi,
> 
> On Fri, May 03, 2024 at 02:01:22PM +0100, Joey Gouly wrote:
> > Allow EL0 or EL1 to access POR_EL0 without being trapped to EL2.
> > 
> > Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > Acked-by: Catalin Marinas <catalin.marinas@arm.com>
> > ---
> >  arch/arm64/include/asm/el2_setup.h | 10 +++++++++-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
> > index b7afaa026842..df5614be4b70 100644
> > --- a/arch/arm64/include/asm/el2_setup.h
> > +++ b/arch/arm64/include/asm/el2_setup.h
> > @@ -184,12 +184,20 @@
> >  .Lset_pie_fgt_\@:
> >  	mrs_s	x1, SYS_ID_AA64MMFR3_EL1
> >  	ubfx	x1, x1, #ID_AA64MMFR3_EL1_S1PIE_SHIFT, #4
> > -	cbz	x1, .Lset_fgt_\@
> > +	cbz	x1, .Lset_poe_fgt_\@
> >  
> >  	/* Disable trapping of PIR_EL1 / PIRE0_EL1 */
> >  	orr	x0, x0, #HFGxTR_EL2_nPIR_EL1
> >  	orr	x0, x0, #HFGxTR_EL2_nPIRE0_EL1
> >  
> > +.Lset_poe_fgt_\@:
> > +	mrs_s	x1, SYS_ID_AA64MMFR3_EL1
> > +	ubfx	x1, x1, #ID_AA64MMFR3_EL1_S1POE_SHIFT, #4
> > +	cbz	x1, .Lset_fgt_\@
> > +
> > +	/* Disable trapping of POR_EL0 */
> > +	orr	x0, x0, #HFGxTR_EL2_nPOR_EL0
> 
> Do I understand correctly that this is just to allow the host to access
> its own POR_EL0, before (or unless) KVM starts up?

Yup.

> 
> KVM always overrides all the EL2 trap controls while running a guest,
> right?  We don't want this bit still set when running in a guest just
> because KVM doesn't know about POE yet.

KVM currently traps the POE regs unconditionally; this series makes that
conditional.

> 
> (Hopefully this follows naturally from the way the KVM code works, but
> my KVM-fu is a bit rusty.)
> 
> Also, what about POR_EL1?  Do we have to reset that to something sane
> (and so untrap it here), or it is sufficient if we never turn on POE
> support in the host, via TCR2_EL1.POE?

Since the host isn't using it, we don't need to reset it. It will be reset to an unknown value for guests.

In patch 7:

+	{ SYS_DESC(SYS_POR_EL1), NULL, reset_unknown, POR_EL1 },

> 
> [...]
> 
> Cheers
> ---Dave

Thanks,
Joey

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-08-01 16:22       ` Dave Martin
@ 2024-08-06 10:35         ` Joey Gouly
  2024-08-06 14:31           ` Joey Gouly
  0 siblings, 1 reply; 146+ messages in thread
From: Joey Gouly @ 2024-08-06 10:35 UTC (permalink / raw)
  To: Dave Martin
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Thu, Aug 01, 2024 at 05:22:45PM +0100, Dave Martin wrote:
> On Thu, Aug 01, 2024 at 04:54:41PM +0100, Joey Gouly wrote:
> > On Thu, Jul 25, 2024 at 05:00:18PM +0100, Dave Martin wrote:
> > > Hi,
> > > 
> > > On Fri, May 03, 2024 at 02:01:36PM +0100, Joey Gouly wrote:
> > > > Add PKEY support to signals, by saving and restoring POR_EL0 from the stackframe.
> > > > 
> > > > Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> > > > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > > Cc: Will Deacon <will@kernel.org>
> > > > Reviewed-by: Mark Brown <broonie@kernel.org>
> > > > Acked-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
> > > > ---
> > > >  arch/arm64/include/uapi/asm/sigcontext.h |  7 ++++
> > > >  arch/arm64/kernel/signal.c               | 52 ++++++++++++++++++++++++
> > > >  2 files changed, 59 insertions(+)
> > > > 
> > > > diff --git a/arch/arm64/include/uapi/asm/sigcontext.h b/arch/arm64/include/uapi/asm/sigcontext.h
> > > > index 8a45b7a411e0..e4cba8a6c9a2 100644
> > > > --- a/arch/arm64/include/uapi/asm/sigcontext.h
> > > > +++ b/arch/arm64/include/uapi/asm/sigcontext.h
> > > 
> > > [...]
> > > 
> > > > @@ -980,6 +1013,13 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
> > > >  			return err;
> > > >  	}
> > > >  
> > > > +	if (system_supports_poe()) {
> > > > +		err = sigframe_alloc(user, &user->poe_offset,
> > > > +				     sizeof(struct poe_context));
> > > > +		if (err)
> > > > +			return err;
> > > > +	}
> > > > +
> > > >  	return sigframe_alloc_end(user);
> > > >  }
> > > >  
> > > > @@ -1020,6 +1060,15 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user,
> > > >  		__put_user_error(current->thread.fault_code, &esr_ctx->esr, err);
> > > >  	}
> > > >  
> > > > +	if (system_supports_poe() && err == 0 && user->poe_offset) {
> > > > +		struct poe_context __user *poe_ctx =
> > > > +			apply_user_offset(user, user->poe_offset);
> > > > +
> > > > +		__put_user_error(POE_MAGIC, &poe_ctx->head.magic, err);
> > > > +		__put_user_error(sizeof(*poe_ctx), &poe_ctx->head.size, err);
> > > > +		__put_user_error(read_sysreg_s(SYS_POR_EL0), &poe_ctx->por_el0, err);
> > > > +	}
> > > > +
> > > 
> > > Does the AArch64 procedure call standard say anything about whether
> > > POR_EL0 is caller-saved?
> > 
> > I asked about this, and it doesn't say anything and they don't plan on it,
> > since it's very application specific.
> 
> Right.  I think that confirms that we don't absolutely need to preserve
> POR_EL0, because if compiler-generated code was allowed to fiddle with
> this and not clean up after itself, the PCS would have to document this.
> 
> > > 
> > > <bikeshed>
> > > 
> > > In theory we could skip saving this register if it is already
> > > POR_EL0_INIT (which it often will be), and if the signal handler is not
> > > supposed to modify and leave the modified value in the register when
> > > returning.
> > > 
> > > The complexity of the additional check may be a bit pointless though,
> > > and the handler might theoretically want to change the interrupted
> > > code's POR_EL0 explicitly, which would be complicated if POE_MAGIC is
> > > sometimes there and sometimes not.
> > > 
> > > </bikeshed>
> > > 
> > I think trying to skip/optimise something here would be more effort than any
> > possible benefits!
> 
> Actually, having thought about this some more I think that only dumping
> this register if != POR_EL0_INIT may be the right thing to do.
> 
> This would mean that old binaries' stacks would never see poe_context in
> the signal frame, and so will never experience unexpected stack
> overruns (at least, not due solely to the presence of this feature).

They can already see things they don't expect, like FPMR that was added
recently.

> 
> POE-aware signal handlers have to do something fiddly and nonportable
> to obtain the original value of POR_EL0 regardless, so requiring them
> to handle both cases (present in sigframe and absent) doesn't seem too
> onerous to me.

If the signal handler wanted to modify the value from the default, wouldn't it
need to mess around with the sigcontext layout to allocate some space for
POR_EL0, so that the kernel would restore it properly? (If that's even
possible.)

> 
> 
> Do you think this approach would break any known use cases?

Not sure.

> 
> Cheers
> ---Dave
> 

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 15/29] arm64: handle PKEY/POE faults
  2024-08-01 16:01     ` Joey Gouly
@ 2024-08-06 13:33       ` Dave Martin
  2024-08-06 13:43         ` Joey Gouly
  0 siblings, 1 reply; 146+ messages in thread
From: Dave Martin @ 2024-08-06 13:33 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

Hi,

On Thu, Aug 01, 2024 at 05:01:10PM +0100, Joey Gouly wrote:
> On Thu, Jul 25, 2024 at 04:57:09PM +0100, Dave Martin wrote:
> > On Fri, May 03, 2024 at 02:01:33PM +0100, Joey Gouly wrote:
> > > If a memory fault occurs that is due to an overlay/pkey fault, report that to
> > > userspace with a SEGV_PKUERR.
> > > 
> > > Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> > > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > Cc: Will Deacon <will@kernel.org>
> > > ---
> > >  arch/arm64/include/asm/traps.h |  1 +
> > >  arch/arm64/kernel/traps.c      | 12 ++++++--
> > >  arch/arm64/mm/fault.c          | 56 ++++++++++++++++++++++++++++++++--
> > >  3 files changed, 64 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/arch/arm64/include/asm/traps.h b/arch/arm64/include/asm/traps.h
> > > index eefe766d6161..f6f6f2cb7f10 100644
> > > --- a/arch/arm64/include/asm/traps.h
> > > +++ b/arch/arm64/include/asm/traps.h
> > > @@ -25,6 +25,7 @@ try_emulate_armv8_deprecated(struct pt_regs *regs, u32 insn)
> > >  void force_signal_inject(int signal, int code, unsigned long address, unsigned long err);
> > >  void arm64_notify_segfault(unsigned long addr);
> > >  void arm64_force_sig_fault(int signo, int code, unsigned long far, const char *str);
> > > +void arm64_force_sig_fault_pkey(int signo, int code, unsigned long far, const char *str, int pkey);
> > >  void arm64_force_sig_mceerr(int code, unsigned long far, short lsb, const char *str);
> > >  void arm64_force_sig_ptrace_errno_trap(int errno, unsigned long far, const char *str);
> > >  
> > > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > > index 215e6d7f2df8..1bac6c84d3f5 100644
> > > --- a/arch/arm64/kernel/traps.c
> > > +++ b/arch/arm64/kernel/traps.c
> > > @@ -263,16 +263,24 @@ static void arm64_show_signal(int signo, const char *str)
> > >  	__show_regs(regs);
> > >  }
> > >  
> > > -void arm64_force_sig_fault(int signo, int code, unsigned long far,
> > > -			   const char *str)
> > > +void arm64_force_sig_fault_pkey(int signo, int code, unsigned long far,
> > > +			   const char *str, int pkey)
> > >  {
> > >  	arm64_show_signal(signo, str);
> > >  	if (signo == SIGKILL)
> > >  		force_sig(SIGKILL);
> > > +	else if (code == SEGV_PKUERR)
> > > +		force_sig_pkuerr((void __user *)far, pkey);
> > 
> > Is signo definitely SIGSEGV here?  It looks to me like we can get in
> > here for SIGBUS, SIGTRAP etc.
> > 
> > si_codes are not unique between different signo here, so I'm wondering
> > whether this should this be:
> > 
> > 	else if (signo == SIGSEGV && code == SEGV_PKUERR)
> > 
> > ...?
> > 
> > 
> > >  	else
> > >  		force_sig_fault(signo, code, (void __user *)far);
> > >  }
> > >  
> > > +void arm64_force_sig_fault(int signo, int code, unsigned long far,
> > > +			   const char *str)
> > > +{
> > > +	arm64_force_sig_fault_pkey(signo, code, far, str, 0);
> > 
> > Is there a reason not to follow the same convention as elsewhere, where
> > -1 is passed for "no pkey"?
> > 
> > If we think this should never be called with signo == SIGSEGV &&
> > code == SEGV_PKUERR and no valid pkey, but it's messy to prove, then
> > maybe a WARN_ON_ONCE() would be worth it here?
> > 
> 
> Anshuman suggested separating them out, which I did as below; I think that
> addresses your comments too?
> 
> diff --git arch/arm64/kernel/traps.c arch/arm64/kernel/traps.c
> index 215e6d7f2df8..49bac9ae04c0 100644
> --- arch/arm64/kernel/traps.c
> +++ arch/arm64/kernel/traps.c
> @@ -273,6 +273,13 @@ void arm64_force_sig_fault(int signo, int code, unsigned long far,
>                 force_sig_fault(signo, code, (void __user *)far);
>  }
>  
> +void arm64_force_sig_fault_pkey(int signo, int code, unsigned long far,
> +                          const char *str, int pkey)
> +{
> +       arm64_show_signal(signo, str);
> +       force_sig_pkuerr((void __user *)far, pkey);
> +}
> +
>  void arm64_force_sig_mceerr(int code, unsigned long far, short lsb,
>                             const char *str)
>  {
> 
> diff --git arch/arm64/mm/fault.c arch/arm64/mm/fault.c
> index 451ba7cbd5ad..1ddd46b97f88 100644
> --- arch/arm64/mm/fault.c
> +++ arch/arm64/mm/fault.c

(Guessing where this is meant to apply, since there is no hunk header
or context...)

> 
> -               arm64_force_sig_fault(SIGSEGV, si_code, far, inf->name);
> +               if (si_code == SEGV_PKUERR)
> +                       arm64_force_sig_fault_pkey(SIGSEGV, si_code, far, inf->name, pkey);

Maybe drop the signo and si_code arguments?  This would mean that
arm64_force_sig_fault_pkey() can't be called with a signo/si_code
combination that makes no sense.

I think pkey faults are always going to be SIGSEGV/SEGV_PKUERR, right?
Or are there other combinations that can apply for these faults?


> +               else
> +                       arm64_force_sig_fault(SIGSEGV, si_code, far, inf->name);

Otherwise yes, I think splitting things this way makes sense.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 15/29] arm64: handle PKEY/POE faults
  2024-08-06 13:33       ` Dave Martin
@ 2024-08-06 13:43         ` Joey Gouly
  2024-08-06 14:38           ` Dave Martin
  0 siblings, 1 reply; 146+ messages in thread
From: Joey Gouly @ 2024-08-06 13:43 UTC (permalink / raw)
  To: Dave Martin
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Tue, Aug 06, 2024 at 02:33:37PM +0100, Dave Martin wrote:
> Hi,
> 
> On Thu, Aug 01, 2024 at 05:01:10PM +0100, Joey Gouly wrote:
> > On Thu, Jul 25, 2024 at 04:57:09PM +0100, Dave Martin wrote:
> > > On Fri, May 03, 2024 at 02:01:33PM +0100, Joey Gouly wrote:
> > > > If a memory fault occurs that is due to an overlay/pkey fault, report that to
> > > > userspace with a SEGV_PKUERR.
> > > > 
> > > > Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> > > > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > > Cc: Will Deacon <will@kernel.org>
> > > > ---
> > > >  arch/arm64/include/asm/traps.h |  1 +
> > > >  arch/arm64/kernel/traps.c      | 12 ++++++--
> > > >  arch/arm64/mm/fault.c          | 56 ++++++++++++++++++++++++++++++++--
> > > >  3 files changed, 64 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/arch/arm64/include/asm/traps.h b/arch/arm64/include/asm/traps.h
> > > > index eefe766d6161..f6f6f2cb7f10 100644
> > > > --- a/arch/arm64/include/asm/traps.h
> > > > +++ b/arch/arm64/include/asm/traps.h
> > > > @@ -25,6 +25,7 @@ try_emulate_armv8_deprecated(struct pt_regs *regs, u32 insn)
> > > >  void force_signal_inject(int signal, int code, unsigned long address, unsigned long err);
> > > >  void arm64_notify_segfault(unsigned long addr);
> > > >  void arm64_force_sig_fault(int signo, int code, unsigned long far, const char *str);
> > > > +void arm64_force_sig_fault_pkey(int signo, int code, unsigned long far, const char *str, int pkey);
> > > >  void arm64_force_sig_mceerr(int code, unsigned long far, short lsb, const char *str);
> > > >  void arm64_force_sig_ptrace_errno_trap(int errno, unsigned long far, const char *str);
> > > >  
> > > > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > > > index 215e6d7f2df8..1bac6c84d3f5 100644
> > > > --- a/arch/arm64/kernel/traps.c
> > > > +++ b/arch/arm64/kernel/traps.c
> > > > @@ -263,16 +263,24 @@ static void arm64_show_signal(int signo, const char *str)
> > > >  	__show_regs(regs);
> > > >  }
> > > >  
> > > > -void arm64_force_sig_fault(int signo, int code, unsigned long far,
> > > > -			   const char *str)
> > > > +void arm64_force_sig_fault_pkey(int signo, int code, unsigned long far,
> > > > +			   const char *str, int pkey)
> > > >  {
> > > >  	arm64_show_signal(signo, str);
> > > >  	if (signo == SIGKILL)
> > > >  		force_sig(SIGKILL);
> > > > +	else if (code == SEGV_PKUERR)
> > > > +		force_sig_pkuerr((void __user *)far, pkey);
> > > 
> > > Is signo definitely SIGSEGV here?  It looks to me like we can get in
> > > here for SIGBUS, SIGTRAP etc.
> > > 
> > > si_codes are not unique between different signo here, so I'm wondering
> > > whether this should this be:
> > > 
> > > 	else if (signo == SIGSEGV && code == SEGV_PKUERR)
> > > 
> > > ...?
> > > 
> > > 
> > > >  	else
> > > >  		force_sig_fault(signo, code, (void __user *)far);
> > > >  }
> > > >  
> > > > +void arm64_force_sig_fault(int signo, int code, unsigned long far,
> > > > +			   const char *str)
> > > > +{
> > > > +	arm64_force_sig_fault_pkey(signo, code, far, str, 0);
> > > 
> > > Is there a reason not to follow the same convention as elsewhere, where
> > > -1 is passed for "no pkey"?
> > > 
> > > If we think this should never be called with signo == SIGSEGV &&
> > > code == SEGV_PKUERR and no valid pkey but if it's messy to prove, then
> > > maybe a WARN_ON_ONCE() would be worth it here?
> > > 
> > 
> > Anshuman suggested to separate them out, which I did like below, I think that
> > addresses your comments too?
> > 
> > diff --git arch/arm64/kernel/traps.c arch/arm64/kernel/traps.c
> > index 215e6d7f2df8..49bac9ae04c0 100644
> > --- arch/arm64/kernel/traps.c
> > +++ arch/arm64/kernel/traps.c
> > @@ -273,6 +273,13 @@ void arm64_force_sig_fault(int signo, int code, unsigned long far,
> >                 force_sig_fault(signo, code, (void __user *)far);
> >  }
> >  
> > +void arm64_force_sig_fault_pkey(int signo, int code, unsigned long far,
> > +                          const char *str, int pkey)
> > +{
> > +       arm64_show_signal(signo, str);
> > +       force_sig_pkuerr((void __user *)far, pkey);
> > +}
> > +
> >  void arm64_force_sig_mceerr(int code, unsigned long far, short lsb,
> >                             const char *str)
> >  {
> > 
> > diff --git arch/arm64/mm/fault.c arch/arm64/mm/fault.c
> > index 451ba7cbd5ad..1ddd46b97f88 100644
> > --- arch/arm64/mm/fault.c
> > +++ arch/arm64/mm/fault.c
> 
> (Guessing where this is meant to apply, since there is no hunk header
> or context...)

Sorry, I had some other changes and just mashed the bits into a diff-looking thing.

> 
> > 
> > -               arm64_force_sig_fault(SIGSEGV, si_code, far, inf->name);
> > +               if (si_code == SEGV_PKUERR)
> > +                       arm64_force_sig_fault_pkey(SIGSEGV, si_code, far, inf->name, pkey);
> 
> Maybe drop the signo and si_code arguments?  This would mean that
> arm64_force_sig_fault_pkey() can't be called with a signo/si_code
> combination that makes no sense.
> 
> I think pkey faults are always going to be SIGSEGV/SEGV_PKUERR, right?
> Or are there other combinations that can apply for these faults?

Ah yes, I can simplify it even more, thanks.

diff --git arch/arm64/kernel/traps.c arch/arm64/kernel/traps.c
index 49bac9ae04c0..d9abb8b390c0 100644
--- arch/arm64/kernel/traps.c
+++ arch/arm64/kernel/traps.c
@@ -273,10 +273,9 @@ void arm64_force_sig_fault(int signo, int code, unsigned long far,
                force_sig_fault(signo, code, (void __user *)far);
 }
 
-void arm64_force_sig_fault_pkey(int signo, int code, unsigned long far,
-                          const char *str, int pkey)
+void arm64_force_sig_fault_pkey(unsigned long far, const char *str, int pkey)
 {
-       arm64_show_signal(signo, str);
+       arm64_show_signal(SIGSEGV, str);
        force_sig_pkuerr((void __user *)far, pkey);
 }


> 
> 
> > +               else
> > +                       arm64_force_sig_fault(SIGSEGV, si_code, far, inf->name);
> 
> Otherwise yes, I think splitting things this way makes sense.
> 
> Cheers
> ---Dave


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-08-06 10:35         ` Joey Gouly
@ 2024-08-06 14:31           ` Joey Gouly
  2024-08-06 15:00             ` Dave Martin
  2024-08-14 15:03             ` Catalin Marinas
  0 siblings, 2 replies; 146+ messages in thread
From: Joey Gouly @ 2024-08-06 14:31 UTC (permalink / raw)
  To: Dave Martin
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Tue, Aug 06, 2024 at 11:35:32AM +0100, Joey Gouly wrote:
> On Thu, Aug 01, 2024 at 05:22:45PM +0100, Dave Martin wrote:
> > On Thu, Aug 01, 2024 at 04:54:41PM +0100, Joey Gouly wrote:
> > > On Thu, Jul 25, 2024 at 05:00:18PM +0100, Dave Martin wrote:
> > > > Hi,
> > > > 
> > > > On Fri, May 03, 2024 at 02:01:36PM +0100, Joey Gouly wrote:
> > > > > Add PKEY support to signals, by saving and restoring POR_EL0 from the stackframe.
> > > > > 
> > > > > Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> > > > > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > > > Cc: Will Deacon <will@kernel.org>
> > > > > Reviewed-by: Mark Brown <broonie@kernel.org>
> > > > > Acked-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
> > > > > ---
> > > > >  arch/arm64/include/uapi/asm/sigcontext.h |  7 ++++
> > > > >  arch/arm64/kernel/signal.c               | 52 ++++++++++++++++++++++++
> > > > >  2 files changed, 59 insertions(+)
> > > > > 
> > > > > diff --git a/arch/arm64/include/uapi/asm/sigcontext.h b/arch/arm64/include/uapi/asm/sigcontext.h
> > > > > index 8a45b7a411e0..e4cba8a6c9a2 100644
> > > > > --- a/arch/arm64/include/uapi/asm/sigcontext.h
> > > > > +++ b/arch/arm64/include/uapi/asm/sigcontext.h
> > > > 
> > > > [...]
> > > > 
> > > > > @@ -980,6 +1013,13 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
> > > > >  			return err;
> > > > >  	}
> > > > >  
> > > > > +	if (system_supports_poe()) {
> > > > > +		err = sigframe_alloc(user, &user->poe_offset,
> > > > > +				     sizeof(struct poe_context));
> > > > > +		if (err)
> > > > > +			return err;
> > > > > +	}
> > > > > +
> > > > >  	return sigframe_alloc_end(user);
> > > > >  }
> > > > >  
> > > > > @@ -1020,6 +1060,15 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user,
> > > > >  		__put_user_error(current->thread.fault_code, &esr_ctx->esr, err);
> > > > >  	}
> > > > >  
> > > > > +	if (system_supports_poe() && err == 0 && user->poe_offset) {
> > > > > +		struct poe_context __user *poe_ctx =
> > > > > +			apply_user_offset(user, user->poe_offset);
> > > > > +
> > > > > +		__put_user_error(POE_MAGIC, &poe_ctx->head.magic, err);
> > > > > +		__put_user_error(sizeof(*poe_ctx), &poe_ctx->head.size, err);
> > > > > +		__put_user_error(read_sysreg_s(SYS_POR_EL0), &poe_ctx->por_el0, err);
> > > > > +	}
> > > > > +
> > > > 
> > > > Does the AArch64 procedure call standard say anything about whether
> > > > POR_EL0 is caller-saved?
> > > 
> > > I asked about this, and it doesn't say anything and they don't plan on it,
> > > since it's very application specific.
> > 
> > Right.  I think that confirms that we don't absolutely need to preserve
> > POR_EL0, because if compiler-generated code was allowed to fiddle with
> > this and not clean up after itself, the PCS would have to document this.
> > 
> > > > 
> > > > <bikeshed>
> > > > 
> > > > In theory we could skip saving this register if it is already
> > > > POR_EL0_INIT (which it often will be), and if the signal handler is not
> > > > supposed to modify and leave the modified value in the register when
> > > > returning.
> > > > 
> > > > The complexity of the additional check may be a bit pointless though,
> > > > and the handler might theoretically want to change the interrupted
> > > > code's POR_EL0 explicitly, which would be complicated if POE_MAGIC is
> > > > sometimes there and sometimes not.
> > > > 
> > > > </bikeshed>
> > > > 
> > > I think trying to skip/optimise something here would be more effort than any
> > > possible benefits!
> > 
> > Actually, having thought about this some more I think that only dumping
> > this register if != POR_EL0_INIT may be the right thing to do.
> > 
> > This would mean that old binaries would never see poe_context in
> > the signal frame, and so will never experience unexpected stack
> > overruns (at least, not due solely to the presence of this feature).
> 
> They can already see things they don't expect, like FPMR that was added
> recently.
> 
> > 
> > POE-aware signal handlers have to do something fiddly and nonportable
> > to obtain the original value of POR_EL0 regardless, so requiring them
> > to handle both cases (present in sigframe and absent) doesn't seem too
> > onerous to me.
> 
> If the signal handler wanted to modify the value, from the default, wouldn't it
> need to mess around with the sig context stuff, to allocate some space for
> POR_EL0, such that the kernel would restore it properly? (If that's even
> possible).
> 
> > 
> > 
> > Do you think this approach would break any known use cases?
> 
> Not sure.
> 

We talked about this offline, which helped me understand it more, and I think
something like this makes sense:

diff --git arch/arm64/kernel/signal.c arch/arm64/kernel/signal.c
index 561986947530..ca7d4e0be275 100644
--- arch/arm64/kernel/signal.c
+++ arch/arm64/kernel/signal.c
@@ -1024,7 +1025,10 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
                        return err;
        }
 
-       if (system_supports_poe()) {
+       if (system_supports_poe() &&
+                       (add_all ||
+                        mm_pkey_allocation_map(current->mm) != 0x1 ||
+                        read_sysreg_s(SYS_POR_EL0) != POR_EL0_INIT)) {
                err = sigframe_alloc(user, &user->poe_offset,
                                     sizeof(struct poe_context));
                if (err)


That is, we only save the POR_EL0 value if any pkeys have been allocated (other
than pkey 0) *or* if POR_EL0 is a non-default value.

The latter case is a corner case, where a userspace would have changed POR_EL0
before allocating any extra pkeys.
That could be:
	- pkey 0, if it restricts pkey 0 without allocating other pkeys, it's
	  unlikely the program can do anything useful anyway
	- Another pkey, which userspace probably shouldn't do anyway.
	  The man pages say:
		The kernel guarantees that the contents of the hardware rights
		register (PKRU) will be preserved only for allocated protection keys. Any time
		a key is unallocated (either before the first call returning that key from
		pkey_alloc() or after it is freed via pkey_free()), the kernel may make
		arbitrary changes to the parts of the rights register affecting access to that
		key.
	  So userspace shouldn't be changing POR_EL0 before allocating pkeys anyway.
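The decision above can be sketched as a small host-side model. To be clear, need_poe_record(), the 0x1 "only pkey 0 allocated" map value and POR_EL0_INIT here are illustrative stand-ins for the kernel's types and helpers, not the actual signal.c code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative value: pkey 0's 4-bit field set to RWX, all others 0.
 * The real POR_EL0_INIT comes from the arm64 headers. */
#define POR_EL0_INIT 0x7UL

/*
 * Model of the proposed setup_sigframe_layout() condition: a poe_context
 * record is only needed when sizing the maximal frame (add_all), when any
 * pkey other than the always-allocated pkey 0 is in use, or when POR_EL0
 * has been moved away from its default value.
 */
static bool need_poe_record(bool add_all, uint16_t pkey_allocation_map,
			    uint64_t por_el0)
{
	if (add_all)
		return true;
	if (pkey_allocation_map != 0x1)	/* more than just pkey 0 */
		return true;
	return por_el0 != POR_EL0_INIT;
}
```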

Thanks,
Joey

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 15/29] arm64: handle PKEY/POE faults
  2024-08-06 13:43         ` Joey Gouly
@ 2024-08-06 14:38           ` Dave Martin
  0 siblings, 0 replies; 146+ messages in thread
From: Dave Martin @ 2024-08-06 14:38 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

Hi,

On Tue, Aug 06, 2024 at 02:43:57PM +0100, Joey Gouly wrote:
> On Tue, Aug 06, 2024 at 02:33:37PM +0100, Dave Martin wrote:
> > Hi,
> > 
> > On Thu, Aug 01, 2024 at 05:01:10PM +0100, Joey Gouly wrote:
> > > On Thu, Jul 25, 2024 at 04:57:09PM +0100, Dave Martin wrote:
> > > > On Fri, May 03, 2024 at 02:01:33PM +0100, Joey Gouly wrote:
> > > > > If a memory fault occurs that is due to an overlay/pkey fault, report that to
> > > > > userspace with a SEGV_PKUERR.
> > > > > 
> > > > > Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> > > > > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > > > Cc: Will Deacon <will@kernel.org>
> > > > > ---
> > > > >  arch/arm64/include/asm/traps.h |  1 +
> > > > >  arch/arm64/kernel/traps.c      | 12 ++++++--
> > > > >  arch/arm64/mm/fault.c          | 56 ++++++++++++++++++++++++++++++++--
> > > > >  3 files changed, 64 insertions(+), 5 deletions(-)

[...]

> > > > > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > > > > index 215e6d7f2df8..1bac6c84d3f5 100644
> > > > > --- a/arch/arm64/kernel/traps.c
> > > > > +++ b/arch/arm64/kernel/traps.c
> > > > > @@ -263,16 +263,24 @@ static void arm64_show_signal(int signo, const char *str)
> > > > >  	__show_regs(regs);
> > > > >  }
> > > > >  
> > > > > -void arm64_force_sig_fault(int signo, int code, unsigned long far,
> > > > > -			   const char *str)
> > > > > +void arm64_force_sig_fault_pkey(int signo, int code, unsigned long far,
> > > > > +			   const char *str, int pkey)
> > > > >  {
> > > > >  	arm64_show_signal(signo, str);
> > > > >  	if (signo == SIGKILL)
> > > > >  		force_sig(SIGKILL);
> > > > > +	else if (code == SEGV_PKUERR)
> > > > > +		force_sig_pkuerr((void __user *)far, pkey);
> > > > 
> > > > Is signo definitely SIGSEGV here?  It looks to me like we can get in
> > > > here for SIGBUS, SIGTRAP etc.
> > > > 
> > > > si_codes are not unique between different signo here, so I'm wondering
> > > > whether this should this be:
> > > > 
> > > > 	else if (signo == SIGSEGV && code == SEGV_PKUERR)
> > > > 
> > > > ...?
> > > > 
> > > > 
> > > > >  	else
> > > > >  		force_sig_fault(signo, code, (void __user *)far);
> > > > >  }
> > > > >  
> > > > > +void arm64_force_sig_fault(int signo, int code, unsigned long far,
> > > > > +			   const char *str)
> > > > > +{
> > > > > +	arm64_force_sig_fault_pkey(signo, code, far, str, 0);
> > > > 
> > > > Is there a reason not to follow the same convention as elsewhere, where
> > > > -1 is passed for "no pkey"?
> > > > 
> > > > If we think this should never be called with signo == SIGSEGV &&
> > > > code == SEGV_PKUERR and no valid pkey but if it's messy to prove, then
> > > > maybe a WARN_ON_ONCE() would be worth it here?
> > > > 
> > > 
> > > Anshuman suggested to separate them out, which I did like below, I think that
> > > addresses your comments too?
> > > 
> > > diff --git arch/arm64/kernel/traps.c arch/arm64/kernel/traps.c
> > > index 215e6d7f2df8..49bac9ae04c0 100644
> > > --- arch/arm64/kernel/traps.c
> > > +++ arch/arm64/kernel/traps.c
> > > @@ -273,6 +273,13 @@ void arm64_force_sig_fault(int signo, int code, unsigned long far,
> > >                 force_sig_fault(signo, code, (void __user *)far);
> > >  }
> > >  
> > > +void arm64_force_sig_fault_pkey(int signo, int code, unsigned long far,
> > > +                          const char *str, int pkey)
> > > +{
> > > +       arm64_show_signal(signo, str);
> > > +       force_sig_pkuerr((void __user *)far, pkey);
> > > +}
> > > +
> > >  void arm64_force_sig_mceerr(int code, unsigned long far, short lsb,
> > >                             const char *str)
> > >  {
> > > 
> > > diff --git arch/arm64/mm/fault.c arch/arm64/mm/fault.c
> > > index 451ba7cbd5ad..1ddd46b97f88 100644
> > > --- arch/arm64/mm/fault.c
> > > +++ arch/arm64/mm/fault.c
> > 
> > (Guessing where this is meant to apply, since there is no hunk header
> > or context...)
> 
> Sorry, I had some other changes and just mashed the bits into a diff-looking thing.

Fair enough.  There are a few similar bits of code, so including more
lines of context would have been helpful.

The change looked reasonable though.

> > > 
> > > -               arm64_force_sig_fault(SIGSEGV, si_code, far, inf->name);
> > > +               if (si_code == SEGV_PKUERR)
> > > +                       arm64_force_sig_fault_pkey(SIGSEGV, si_code, far, inf->name, pkey);
> > 
> > Maybe drop the signo and si_code arguments?  This would mean that
> > arm64_force_sig_fault_pkey() can't be called with a signo/si_code
> > combination that makes no sense.
> > 
> > I think pkey faults are always going to be SIGSEGV/SEGV_PKUERR, right?
> > Or are there other combinations that can apply for these faults?
> 
> Ah yes, I can simplify it even more, thanks.
> 
> diff --git arch/arm64/kernel/traps.c arch/arm64/kernel/traps.c
> index 49bac9ae04c0..d9abb8b390c0 100644
> --- arch/arm64/kernel/traps.c
> +++ arch/arm64/kernel/traps.c
> @@ -273,10 +273,9 @@ void arm64_force_sig_fault(int signo, int code, unsigned long far,
>                 force_sig_fault(signo, code, (void __user *)far);
>  }
>  
> -void arm64_force_sig_fault_pkey(int signo, int code, unsigned long far,
> -                          const char *str, int pkey)
> +void arm64_force_sig_fault_pkey(unsigned long far, const char *str, int pkey)
>  {
> -       arm64_show_signal(signo, str);
> +       arm64_show_signal(SIGSEGV, str);
>         force_sig_pkuerr((void __user *)far, pkey);
>  }

Looks sensible.

I see that force_sig_pkuerr() fills in the signo and si_code itself.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-08-06 14:31           ` Joey Gouly
@ 2024-08-06 15:00             ` Dave Martin
  2024-08-14 15:03             ` Catalin Marinas
  1 sibling, 0 replies; 146+ messages in thread
From: Dave Martin @ 2024-08-06 15:00 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Tue, Aug 06, 2024 at 03:31:03PM +0100, Joey Gouly wrote:
> On Tue, Aug 06, 2024 at 11:35:32AM +0100, Joey Gouly wrote:
> > On Thu, Aug 01, 2024 at 05:22:45PM +0100, Dave Martin wrote:
> > > On Thu, Aug 01, 2024 at 04:54:41PM +0100, Joey Gouly wrote:
> > > > On Thu, Jul 25, 2024 at 05:00:18PM +0100, Dave Martin wrote:
> > > > > Hi,
> > > > > 
> > > > > On Fri, May 03, 2024 at 02:01:36PM +0100, Joey Gouly wrote:
> > > > > > Add PKEY support to signals, by saving and restoring POR_EL0 from the stackframe.
> > > > > > 
> > > > > > Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> > > > > > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > > > > Cc: Will Deacon <will@kernel.org>
> > > > > > Reviewed-by: Mark Brown <broonie@kernel.org>
> > > > > > Acked-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
> > > > > > ---
> > > > > >  arch/arm64/include/uapi/asm/sigcontext.h |  7 ++++
> > > > > >  arch/arm64/kernel/signal.c               | 52 ++++++++++++++++++++++++
> > > > > >  2 files changed, 59 insertions(+)
> > > > > > 
> > > > > > diff --git a/arch/arm64/include/uapi/asm/sigcontext.h b/arch/arm64/include/uapi/asm/sigcontext.h
> > > > > > index 8a45b7a411e0..e4cba8a6c9a2 100644
> > > > > > --- a/arch/arm64/include/uapi/asm/sigcontext.h
> > > > > > +++ b/arch/arm64/include/uapi/asm/sigcontext.h
> > > > > 
> > > > > [...]
> > > > > 
> > > > > > @@ -980,6 +1013,13 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
> > > > > >  			return err;
> > > > > >  	}
> > > > > >  
> > > > > > +	if (system_supports_poe()) {
> > > > > > +		err = sigframe_alloc(user, &user->poe_offset,
> > > > > > +				     sizeof(struct poe_context));
> > > > > > +		if (err)
> > > > > > +			return err;
> > > > > > +	}
> > > > > > +
> > > > > >  	return sigframe_alloc_end(user);
> > > > > >  }
> > > > > >  
> > > > > > @@ -1020,6 +1060,15 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user,
> > > > > >  		__put_user_error(current->thread.fault_code, &esr_ctx->esr, err);
> > > > > >  	}
> > > > > >  
> > > > > > +	if (system_supports_poe() && err == 0 && user->poe_offset) {
> > > > > > +		struct poe_context __user *poe_ctx =
> > > > > > +			apply_user_offset(user, user->poe_offset);
> > > > > > +
> > > > > > +		__put_user_error(POE_MAGIC, &poe_ctx->head.magic, err);
> > > > > > +		__put_user_error(sizeof(*poe_ctx), &poe_ctx->head.size, err);
> > > > > > +		__put_user_error(read_sysreg_s(SYS_POR_EL0), &poe_ctx->por_el0, err);
> > > > > > +	}
> > > > > > +
> > > > > 
> > > > > Does the AArch64 procedure call standard say anything about whether
> > > > > POR_EL0 is caller-saved?
> > > > 
> > > > I asked about this, and it doesn't say anything and they don't plan on it,
> > > > since it's very application specific.
> > > 
> > > Right.  I think that confirms that we don't absolutely need to preserve
> > > POR_EL0, because if compiler-generated code was allowed to fiddle with
> > > this and not clean up after itself, the PCS would have to document this.
> > > 
> > > > > 
> > > > > <bikeshed>
> > > > > 
> > > > > In theory we could skip saving this register if it is already
> > > > > POR_EL0_INIT (which it often will be), and if the signal handler is not
> > > > > supposed to modify and leave the modified value in the register when
> > > > > returning.
> > > > > 
> > > > > The complexity of the additional check may be a bit pointless though,
> > > > > and the handler might theoretically want to change the interrupted
> > > > > code's POR_EL0 explicitly, which would be complicated if POE_MAGIC is
> > > > > sometimes there and sometimes not.
> > > > > 
> > > > > </bikeshed>
> > > > > 
> > > > I think trying to skip/optimise something here would be more effort than any
> > > > possible benefits!
> > > 
> > > Actually, having thought about this some more I think that only dumping
> > > this register if != POR_EL0_INIT may be the right thing to do.
> > > 
> > > This would mean that old binaries would never see poe_context in
> > > the signal frame, and so will never experience unexpected stack
> > > overruns (at least, not due solely to the presence of this feature).
> > 
> > They can already see things they don't expect, like FPMR that was added
> > recently.
> > 
> > > 
> > > POE-aware signal handlers have to do something fiddly and nonportable
> > > to obtain the original value of POR_EL0 regardless, so requiring them
> > > to handle both cases (present in sigframe and absent) doesn't seem too
> > > onerous to me.
> > 
> > If the signal handler wanted to modify the value, from the default, wouldn't it
> > need to mess around with the sig context stuff, to allocate some space for
> > POR_EL0, such that the kernel would restore it properly? (If that's even
> > possible).
> > 
> > > 
> > > 
> > > Do you think this approach would break any known use cases?
> > 
> > Not sure.
> > 
> 
> We talked about this offline, which helped me understand it more, and I think
> something like this makes sense:
> 
> diff --git arch/arm64/kernel/signal.c arch/arm64/kernel/signal.c
> index 561986947530..ca7d4e0be275 100644
> --- arch/arm64/kernel/signal.c
> +++ arch/arm64/kernel/signal.c
> @@ -1024,7 +1025,10 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
>                         return err;
>         }
>  
> -       if (system_supports_poe()) {
> +       if (system_supports_poe() &&
> +                       (add_all ||
> +                        mm_pkey_allocation_map(current->mm) != 0x1 ||

We probably ought to be holding the mm lock for read around this (as in
mm/mprotect.c:pkey_alloc()), or have a wrapper to encapsulate that.

Signal delivery is not fast path, so I think we should stick to the
simple and obviously correct approach rather than trying to be
lockless (at least until somebody comes up with a compelling reason to
change it).

If doing that, we should probably put the condition on the allocation
map last so that we don't take the lock unnecessarily.
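Concretely, the suggested ordering might look like this sketch. The struct mm_model, mmap_read_lock()/mmap_read_unlock() and the allocation-map field are stand-ins for the kernel's mm types and helpers; the point is only that the cheap lockless checks short-circuit before the locked bitmap read:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Minimal stand-in for the mm state and its lock. */
struct mm_model {
	uint16_t pkey_allocation_map;
	int read_locks_taken;		/* instrumentation for the sketch */
};

static void mmap_read_lock(struct mm_model *mm)   { mm->read_locks_taken++; }
static void mmap_read_unlock(struct mm_model *mm) { (void)mm; }

#define POR_EL0_INIT 0x7UL	/* illustrative default value */

/*
 * Cheap, lockless checks first; only fall through to the allocation-map
 * read, which wants the mm lock held, when they don't already decide it.
 */
static bool need_poe_record(struct mm_model *mm, bool add_all,
			    uint64_t por_el0)
{
	bool extra_pkeys;

	if (add_all || por_el0 != POR_EL0_INIT)
		return true;

	mmap_read_lock(mm);
	extra_pkeys = mm->pkey_allocation_map != 0x1;
	mmap_read_unlock(mm);

	return extra_pkeys;
}
```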

> +                        read_sysreg_s(SYS_POR_EL0) != POR_EL0_INIT)) {
>                 err = sigframe_alloc(user, &user->poe_offset,
>                                      sizeof(struct poe_context));
>                 if (err)
> 
> 
> That is, we only save the POR_EL0 value if any pkeys have been allocated (other
> than pkey 0) *or* if POR_EL0 is a non-default value.
> 
> The latter case is a corner case, where a userspace would have changed POR_EL0
> before allocating any extra pkeys.
> That could be:
> 	- pkey 0, if it restricts pkey 0 without allocating other pkeys, it's
> 	  unlikely the program can do anything useful anyway
> 	- Another pkey, which userspace probably shouldn't do anyway.
> 	  The man pages say:
> 		The kernel guarantees that the contents of the hardware rights
> 		register (PKRU) will be preserved only for allocated protection keys. Any time
> 		a key is unallocated (either before the first call returning that key from
> 		pkey_alloc() or after it is freed via pkey_free()), the kernel may make
> 		arbitrary changes to the parts of the rights register affecting access to that
> 		key.
> 	  So userspace shouldn't be changing POR_EL0 before allocating pkeys anyway.
> 
> Thanks,
> Joey

This seems better, thanks.

I'll leave it for others to comment on whether this is an issue for any
pkeys use case, but it does mean that non-POE-aware arm64 code
shouldn't see any impact on the signal handling side.

Your new approach means that poe_context is always present in the
sigframe for code that is using POE, so I think that reasonable
scenarios of wanting to change the POR_EL0 value for sigreturn ought
to work.

Processes that contain a mixture of POE and non-POE code are a bit more
of a grey area, but I think that libraries should not be arbitrarily
commandeering pkeys since they have no way to be sure that won't break
the calling program...  I'm assuming that we won't have to worry about
that scenario in practice.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-08-06 14:31           ` Joey Gouly
  2024-08-06 15:00             ` Dave Martin
@ 2024-08-14 15:03             ` Catalin Marinas
  2024-08-15 13:18               ` Joey Gouly
  1 sibling, 1 reply; 146+ messages in thread
From: Catalin Marinas @ 2024-08-14 15:03 UTC (permalink / raw)
  To: Joey Gouly
  Cc: Dave Martin, linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar,
	bp, broonie, christophe.leroy, dave.hansen, hpa, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

Hi Joey,

On Tue, Aug 06, 2024 at 03:31:03PM +0100, Joey Gouly wrote:
> diff --git arch/arm64/kernel/signal.c arch/arm64/kernel/signal.c
> index 561986947530..ca7d4e0be275 100644
> --- arch/arm64/kernel/signal.c
> +++ arch/arm64/kernel/signal.c
> @@ -1024,7 +1025,10 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
>                         return err;
>         }
>  
> -       if (system_supports_poe()) {
> +       if (system_supports_poe() &&
> +                       (add_all ||
> +                        mm_pkey_allocation_map(current->mm) != 0x1 ||
> +                        read_sysreg_s(SYS_POR_EL0) != POR_EL0_INIT)) {
>                 err = sigframe_alloc(user, &user->poe_offset,
>                                      sizeof(struct poe_context));
>                 if (err)
> 
> 
> That is, we only save the POR_EL0 value if any pkeys have been allocated (other
> than pkey 0) *or* if POR_EL0 is a non-default value.

I had a chat with Dave as well on this and, in principle, we don't want
to add stuff to the signal frame unnecessarily, especially for old
binaries that have no clue of pkeys. OTOH, it looks too complicated
for just 16 bytes. Also POR_EL0 all RWX is a valid combination, I don't
think we should exclude it.

If no pkey has been allocated, I guess we could skip this and it also
matches the x86 description of the PKRU being guaranteed to be preserved
only for the allocated keys. Do we reserve pkey 0 for arm64? I thought
that's only an x86 thing to emulate execute-only mappings.

Another corner case would be the signal handler doing a pkey_alloc() and
wanting to populate POR_EL0 on sigreturn. It will have to find room in
the signal frame, though I don't think that's a problem.

-- 
Catalin

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-08-14 15:03             ` Catalin Marinas
@ 2024-08-15 13:18               ` Joey Gouly
  2024-08-15 15:09                 ` Dave Martin
  0 siblings, 1 reply; 146+ messages in thread
From: Joey Gouly @ 2024-08-15 13:18 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Dave Martin, linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar,
	bp, broonie, christophe.leroy, dave.hansen, hpa, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

Hi Catalin,

On Wed, Aug 14, 2024 at 04:03:47PM +0100, Catalin Marinas wrote:
> Hi Joey,
> 
> On Tue, Aug 06, 2024 at 03:31:03PM +0100, Joey Gouly wrote:
> > diff --git arch/arm64/kernel/signal.c arch/arm64/kernel/signal.c
> > index 561986947530..ca7d4e0be275 100644
> > --- arch/arm64/kernel/signal.c
> > +++ arch/arm64/kernel/signal.c
> > @@ -1024,7 +1025,10 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
> >                         return err;
> >         }
> >  
> > -       if (system_supports_poe()) {
> > +       if (system_supports_poe() &&
> > +                       (add_all ||
> > +                        mm_pkey_allocation_map(current->mm) != 0x1 ||
> > +                        read_sysreg_s(SYS_POR_EL0) != POR_EL0_INIT)) {
> >                 err = sigframe_alloc(user, &user->poe_offset,
> >                                      sizeof(struct poe_context));
> >                 if (err)
> > 
> > 
> > That is, we only save the POR_EL0 value if any pkeys have been allocated (other
> > than pkey 0) *or* if POR_EL0 is a non-default value.
> 
> I had a chat with Dave as well on this and, in principle, we don't want
> to add stuff to the signal frame unnecessarily, especially for old
> binaries that have no clue of pkeys. OTOH, it looks too complicated
> for just 16 bytes. Also POR_EL0 all RWX is a valid combination, I don't
> think we should exclude it.
> 
> If no pkey has been allocated, I guess we could skip this and it also
> matches the x86 description of the PKRU being guaranteed to be preserved
> only for the allocated keys. Do we reserve pkey 0 for arm64? I thought
> that's only an x86 thing to emulate execute-only mappings.

To make it less complicated, I could drop the POR_EL0 check and just do:

-       if (system_supports_poe()) {
+       if (system_supports_poe() &&
+                       (add_all ||
+                        mm_pkey_allocation_map(current->mm) != 0x1)) {

This wouldn't preserve the value of POR_EL0 if no pkeys had been allocated, but
that is fine, as you said / the man pages say.

We don't reserve pkey 0, but it is the default for mappings and defaults to
RWX, so changing it probably will lead to unexpected things.

> 
> Another corner case would be the signal handler doing a pkey_alloc() and
> willing to populate POR_EL0 on sigreturn. It will have to find room in
> the signal handler, though I don't think that's a problem.

pkey_alloc() doesn't appear in the signal-safety(7) man page, but that might
just be an omission because protection keys are newer, rather than a statement
that pkey_alloc() can't be used.

If POR_EL0 isn't in the sig context, I think the signal handler could just
write the POR_EL0 system register directly? The kernel wouldn't restore POR_EL0
in that case, so the value set in the signal handler would just be preserved.

The reason that trying to preserve the value of POR_EL0 without any pkeys
allocated (as the patch in my previous e-mail did) doesn't really make
sense is that pkey_alloc() requires an initial value for the pkey, which
will overwrite whatever you may have manually written into POR_EL0. Also,
you can't pass an unallocated pkey to pkey_mprotect().


That's a lot of words to say, or ask, do you agree with the approach of only
saving POR_EL0 in the signal frame if num_allocated_pkeys() > 1?

Thanks,
Joey

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-08-15 13:18               ` Joey Gouly
@ 2024-08-15 15:09                 ` Dave Martin
  2024-08-15 15:24                   ` Mark Brown
  2024-08-19 17:09                   ` Catalin Marinas
  0 siblings, 2 replies; 146+ messages in thread
From: Dave Martin @ 2024-08-15 15:09 UTC (permalink / raw)
  To: Joey Gouly
  Cc: Catalin Marinas, linux-arm-kernel, akpm, aneesh.kumar,
	aneesh.kumar, bp, broonie, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Thu, Aug 15, 2024 at 02:18:15PM +0100, Joey Gouly wrote:
> Hi Catalin,
> 
> On Wed, Aug 14, 2024 at 04:03:47PM +0100, Catalin Marinas wrote:
> > Hi Joey,
> > 
> > On Tue, Aug 06, 2024 at 03:31:03PM +0100, Joey Gouly wrote:
> > > diff --git arch/arm64/kernel/signal.c arch/arm64/kernel/signal.c
> > > index 561986947530..ca7d4e0be275 100644
> > > --- arch/arm64/kernel/signal.c
> > > +++ arch/arm64/kernel/signal.c
> > > @@ -1024,7 +1025,10 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
> > >                         return err;
> > >         }
> > >  
> > > -       if (system_supports_poe()) {
> > > +       if (system_supports_poe() &&
> > > +                       (add_all ||
> > > +                        mm_pkey_allocation_map(current->mm) != 0x1 ||
> > > +                        read_sysreg_s(SYS_POR_EL0) != POR_EL0_INIT)) {
> > >                 err = sigframe_alloc(user, &user->poe_offset,
> > >                                      sizeof(struct poe_context));
> > >                 if (err)
> > > 
> > > 
> > > That is, we only save the POR_EL0 value if any pkeys have been allocated (other
> > > than pkey 0) *or* if POR_EL0 is a non-default value.
> > 
> > I had a chat with Dave as well on this and, in principle, we don't want
> > to add stuff to the signal frame unnecessarily, especially for old
> > binaries that have no clue of pkeys. OTOH, it looks like too complicated
> > for just 16 bytes. Also POR_EL0 all RWX is a valid combination, I don't
> > think we should exclude it.

Unfortunately, this is always going to be the obviously simpler and
more robust option for dealing with any new register state.

In effect, the policy will be to push back on unconditional additions
to the signal frame, except for 100% of proposed additions...


I'm coming round to the view that trying to provide absolute guarantees
about the signal frame size is unsustainable.  x86 didn't, and got away
with it for some time...  Maybe we should just get rid of the relevant
comments in headers, and water down guarantees in the SVE/SME
documentation to recommendations with no promise attached?

I can propose a patch for that.

> > 
> > If no pkey has been allocated, I guess we could skip this and it also
> > matches the x86 description of the PKRU being guaranteed to be preserved
> > only for the allocated keys. Do we reserve pkey 0 for arm64? I thought
> > that's only an x86 thing to emulate execute-only mappings.

It's not clear whether pkey 0 is reserved in the sense of being
permanently allocated, or in the sense of being unavailable for
allocation.

Since userspace gets pages with pkey 0 by default and can fiddle with
the permissions on POR_EL0 and set this pkey onto pages using
pkey_mprotect(), I'd say pkey 0 counts as always allocated; and the
value of the POR_EL0 bits for pkey 0 needs to be maintained.

> 
> To make it less complicated, I could drop the POR_EL0 check and just do:
> 
> -       if (system_supports_poe()) {
> +       if (system_supports_poe() &&
> +                       (add_all ||
> +                        mm_pkey_allocation_map(current->mm) != 0x1)) {
> 
> This wouldn't preserve the value of POR_EL0 if no pkeys had been allocated, but
> that is fine, as you said / the man pages say.
> 
> We don't reserve pkey 0, but it is the default for mappings and defaults to
> RWX, so changing it probably will lead to unexpected things.
> 
> > 
> > Another corner case would be the signal handler doing a pkey_alloc() and
> > willing to populate POR_EL0 on sigreturn. It will have to find room in
> > the signal handler, though I don't think that's a problem.
> 
> pkey_alloc() doesn't appear in the signal-safety(7) man page, but that might
> just be an omission because protection keys are newer, rather than a statement
> that pkey_alloc() can't be used.

In practice this is likely to be a thin syscall wrapper; those are
async-signal-safe in practice on Linux (but documentation tends to take
a while to catch up).  (Exceptions exist where "safe" calls are used
in ways that interfere with the internal operation of libc... but those
cases are mostly at least semi-obvious and rarely documented.)

Using pkey_alloc() in a signal handler doesn't seem a great idea for
more straightforward reasons, though:  pkeys are a scarce, per-process
resource, and allocating them asynchronously in the presence of other
threads etc., doesn't seem like a recipe for success.

I haven't looked, but I'd be surprised if there's any code doing this!

Generally, it's too late to allocate any non-trivial kind of resource
once you're in a signal handler... you need to plan ahead.

> 
> If POR_EL0 isn't in the sig context, I think the signal handler could just
> write the POR_EL0 system register directly? The kernel wouldn't restore POR_EL0
> in that case, so the value set in the signal handler would just be preserved.
> 
> The reason that trying to preserve the value of POR_EL0 without any pkeys
> allocated (as the patch in my previous e-mail did) doesn't really make
> sense is that pkey_alloc() requires an initial value for the pkey, which
> will overwrite whatever you may have manually written into POR_EL0. Also,
> you can't pass an unallocated pkey to pkey_mprotect().

My argument here was that from the signal handler's point of view, the
POR_EL0 value of the interrupted context lives in the sigframe if it's
there (and will then be restored from there), and directly in POR_EL0
otherwise.  Parsing the sigframe determines where the image of POR_EL0 is.

I see two potential problems.

1) (probably not a big deal in practice)

Suppose the signal handler wants to withdraw a permission from pkey 0 for
the interrupted context, and the interrupted context had no permission
on any other pkey, so POR_EL0 is not dumped and the handler must update
POR_EL0 directly before returning.

In this scenario, the interrupted code would explode on return unless
it can cope with globally execute-only or execute-read-only permissions.
(no-execute is obviously dead on arrival).

If a signal handler really really wanted to do this, it could return
through an asm trampoline that is able to cope with the reduced
permissions.  This seems like a highly contrived scenario though, and I
can't see how it could be useful...

2) (possibly a bigger deal) pkeys(7) does say explicitly (well, sort of)
that the PKRU bits are restored on sigreturn.

Since there are generic APIs to manipulate pkeys, it might cause
problems if sigreturn restores the pkey permissions on some arches
but not on others.  Some non-x86-specific software might already be 
relying on the restoration of the permissions.


> That's a lot of words to say, or ask, do you agree with the approach of only
> saving POR_EL0 in the signal frame if num_allocated_pkeys() > 1?
> 
> Thanks,
> Joey

...So..., given all the above, it is perhaps best to go back to
dumping POR_EL0 unconditionally after all, unless we have a mechanism
to determine whether pkeys are in use at all.

If we initially trapped POR_EL0, and set a flag the first time it is
accessed or one of the pkeys syscalls is used, then we could dump
POR_EL0 conditionally based on that: once the flag is set, we always
dump it for that process.  If the first POR_EL0 access or pkeys API
interaction is in a signal handler, and that handler modifies POR_EL0,
then it wouldn't get restored (or at least, not automatically).  Not
sure if this would ever matter in practice.

It's potentially fiddly to make it work 100% consistently though (does
a sigreturn count as a write?  What if the first access to POR_EL0 is
through ptrace, etc.?)

Cheers
---Dave

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-08-15 15:09                 ` Dave Martin
@ 2024-08-15 15:24                   ` Mark Brown
  2024-08-19 17:09                   ` Catalin Marinas
  1 sibling, 0 replies; 146+ messages in thread
From: Mark Brown @ 2024-08-15 15:24 UTC (permalink / raw)
  To: Dave Martin
  Cc: Joey Gouly, Catalin Marinas, linux-arm-kernel, akpm, aneesh.kumar,
	aneesh.kumar, bp, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Thu, Aug 15, 2024 at 04:09:26PM +0100, Dave Martin wrote:

> I'm coming round to the view that trying to provide absolute guarantees
> about the signal frame size is unsustainable.  x86 didn't, and got away
> with it for some time...  Maybe we should just get rid of the relevant
> comments in headers, and water down guarantees in the SVE/SME
> documentation to recommendations with no promise attached?

I tend to agree, especially given that even within the fixed size our
layout is variable.  It creates contortions and realistically the big
issue is the vector extensions rather than anything else.  There we're
kind of constrained in what we can do.

* Re: [PATCH v4 07/29] KVM: arm64: Save/restore POE registers
  2024-05-03 13:01 ` [PATCH v4 07/29] KVM: arm64: Save/restore POE registers Joey Gouly
  2024-05-29 15:43   ` Marc Zyngier
@ 2024-08-16 14:55   ` Marc Zyngier
  2024-08-16 15:13     ` Joey Gouly
  1 sibling, 1 reply; 146+ messages in thread
From: Marc Zyngier @ 2024-08-16 14:55 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, mingo, mpe, naveen.n.rao,
	npiggin, oliver.upton, shuah, szabolcs.nagy, tglx, will, x86,
	kvmarm

On Fri, 03 May 2024 14:01:25 +0100,
Joey Gouly <joey.gouly@arm.com> wrote:
> 
> Define the new system registers that POE introduces and context switch them.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Oliver Upton <oliver.upton@linux.dev>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/kvm_host.h          |  4 +++
>  arch/arm64/include/asm/vncr_mapping.h      |  1 +
>  arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 29 ++++++++++++++++++++++
>  arch/arm64/kvm/sys_regs.c                  |  8 ++++--
>  4 files changed, 40 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 9e8a496fb284..28042da0befd 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -419,6 +419,8 @@ enum vcpu_sysreg {
>  	GCR_EL1,	/* Tag Control Register */
>  	TFSRE0_EL1,	/* Tag Fault Status Register (EL0) */
>  
> +	POR_EL0,	/* Permission Overlay Register 0 (EL0) */
> +
>  	/* 32bit specific registers. */
>  	DACR32_EL2,	/* Domain Access Control Register */
>  	IFSR32_EL2,	/* Instruction Fault Status Register */
> @@ -489,6 +491,8 @@ enum vcpu_sysreg {
>  	VNCR(PIR_EL1),	 /* Permission Indirection Register 1 (EL1) */
>  	VNCR(PIRE0_EL1), /*  Permission Indirection Register 0 (EL1) */
>  
> +	VNCR(POR_EL1),	/* Permission Overlay Register 1 (EL1) */
> +
>  	VNCR(HFGRTR_EL2),
>  	VNCR(HFGWTR_EL2),
>  	VNCR(HFGITR_EL2),
> diff --git a/arch/arm64/include/asm/vncr_mapping.h b/arch/arm64/include/asm/vncr_mapping.h
> index df2c47c55972..06f8ec0906a6 100644
> --- a/arch/arm64/include/asm/vncr_mapping.h
> +++ b/arch/arm64/include/asm/vncr_mapping.h
> @@ -52,6 +52,7 @@
>  #define VNCR_PIRE0_EL1		0x290
>  #define VNCR_PIRE0_EL2		0x298
>  #define VNCR_PIR_EL1		0x2A0
> +#define VNCR_POR_EL1		0x2A8
>  #define VNCR_ICH_LR0_EL2        0x400
>  #define VNCR_ICH_LR1_EL2        0x408
>  #define VNCR_ICH_LR2_EL2        0x410
> diff --git a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
> index 4be6a7fa0070..1c9536557bae 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
> @@ -16,9 +16,15 @@
>  #include <asm/kvm_hyp.h>
>  #include <asm/kvm_mmu.h>
>  
> +static inline bool ctxt_has_s1poe(struct kvm_cpu_context *ctxt);
> +
>  static inline void __sysreg_save_common_state(struct kvm_cpu_context *ctxt)
>  {
>  	ctxt_sys_reg(ctxt, MDSCR_EL1)	= read_sysreg(mdscr_el1);
> +
> +	// POR_EL0 can affect uaccess, so must be saved/restored early.
> +	if (ctxt_has_s1poe(ctxt))
> +		ctxt_sys_reg(ctxt, POR_EL0)	= read_sysreg_s(SYS_POR_EL0);
>  }
>  
>  static inline void __sysreg_save_user_state(struct kvm_cpu_context *ctxt)
> @@ -55,6 +61,17 @@ static inline bool ctxt_has_s1pie(struct kvm_cpu_context *ctxt)
>  	return kvm_has_feat(kern_hyp_va(vcpu->kvm), ID_AA64MMFR3_EL1, S1PIE, IMP);
>  }
>  
> +static inline bool ctxt_has_s1poe(struct kvm_cpu_context *ctxt)
> +{
> +	struct kvm_vcpu *vcpu;
> +
> +	if (!system_supports_poe())
> +		return false;
> +
> +	vcpu = ctxt_to_vcpu(ctxt);
> +	return kvm_has_feat(kern_hyp_va(vcpu->kvm), ID_AA64MMFR3_EL1, S1POE, IMP);
> +}
> +
>  static inline void __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
>  {
>  	ctxt_sys_reg(ctxt, SCTLR_EL1)	= read_sysreg_el1(SYS_SCTLR);
> @@ -77,6 +94,10 @@ static inline void __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
>  		ctxt_sys_reg(ctxt, PIR_EL1)	= read_sysreg_el1(SYS_PIR);
>  		ctxt_sys_reg(ctxt, PIRE0_EL1)	= read_sysreg_el1(SYS_PIRE0);
>  	}
> +
> +	if (ctxt_has_s1poe(ctxt))
> +		ctxt_sys_reg(ctxt, POR_EL1)	= read_sysreg_el1(SYS_POR);
> +
>  	ctxt_sys_reg(ctxt, PAR_EL1)	= read_sysreg_par();
>  	ctxt_sys_reg(ctxt, TPIDR_EL1)	= read_sysreg(tpidr_el1);
>  
> @@ -107,6 +128,10 @@ static inline void __sysreg_save_el2_return_state(struct kvm_cpu_context *ctxt)
>  static inline void __sysreg_restore_common_state(struct kvm_cpu_context *ctxt)
>  {
>  	write_sysreg(ctxt_sys_reg(ctxt, MDSCR_EL1),  mdscr_el1);
> +
> +	// POR_EL0 can affect uaccess, so must be saved/restored early.
> +	if (ctxt_has_s1poe(ctxt))
> +		write_sysreg_s(ctxt_sys_reg(ctxt, POR_EL0),	SYS_POR_EL0);
>  }
>  
>  static inline void __sysreg_restore_user_state(struct kvm_cpu_context *ctxt)
> @@ -153,6 +178,10 @@ static inline void __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt)
>  		write_sysreg_el1(ctxt_sys_reg(ctxt, PIR_EL1),	SYS_PIR);
>  		write_sysreg_el1(ctxt_sys_reg(ctxt, PIRE0_EL1),	SYS_PIRE0);
>  	}
> +
> +	if (ctxt_has_s1poe(ctxt))
> +		write_sysreg_el1(ctxt_sys_reg(ctxt, POR_EL1),	SYS_POR);
> +
>  	write_sysreg(ctxt_sys_reg(ctxt, PAR_EL1),	par_el1);
>  	write_sysreg(ctxt_sys_reg(ctxt, TPIDR_EL1),	tpidr_el1);
>  
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index c9f4f387155f..be04fae35afb 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -2423,6 +2423,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>  	{ SYS_DESC(SYS_MAIR_EL1), access_vm_reg, reset_unknown, MAIR_EL1 },
>  	{ SYS_DESC(SYS_PIRE0_EL1), NULL, reset_unknown, PIRE0_EL1 },
>  	{ SYS_DESC(SYS_PIR_EL1), NULL, reset_unknown, PIR_EL1 },
> +	{ SYS_DESC(SYS_POR_EL1), NULL, reset_unknown, POR_EL1 },
>  	{ SYS_DESC(SYS_AMAIR_EL1), access_vm_reg, reset_amair_el1, AMAIR_EL1 },
>  
>  	{ SYS_DESC(SYS_LORSA_EL1), trap_loregion },
> @@ -2506,6 +2507,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>  	  .access = access_pmovs, .reg = PMOVSSET_EL0,
>  	  .get_user = get_pmreg, .set_user = set_pmreg },
>  
> +	{ SYS_DESC(SYS_POR_EL0), NULL, reset_unknown, POR_EL0 },
>  	{ SYS_DESC(SYS_TPIDR_EL0), NULL, reset_unknown, TPIDR_EL0 },
>  	{ SYS_DESC(SYS_TPIDRRO_EL0), NULL, reset_unknown, TPIDRRO_EL0 },
>  	{ SYS_DESC(SYS_TPIDR2_EL0), undef_access },
> @@ -4057,8 +4059,6 @@ void kvm_init_sysreg(struct kvm_vcpu *vcpu)
>  	kvm->arch.fgu[HFGxTR_GROUP] = (HFGxTR_EL2_nAMAIR2_EL1		|
>  				       HFGxTR_EL2_nMAIR2_EL1		|
>  				       HFGxTR_EL2_nS2POR_EL1		|
> -				       HFGxTR_EL2_nPOR_EL1		|
> -				       HFGxTR_EL2_nPOR_EL0		|
>  				       HFGxTR_EL2_nACCDATA_EL1		|
>  				       HFGxTR_EL2_nSMPRI_EL1_MASK	|
>  				       HFGxTR_EL2_nTPIDR2_EL0_MASK);
> @@ -4093,6 +4093,10 @@ void kvm_init_sysreg(struct kvm_vcpu *vcpu)
>  		kvm->arch.fgu[HFGxTR_GROUP] |= (HFGxTR_EL2_nPIRE0_EL1 |
>  						HFGxTR_EL2_nPIR_EL1);
>  
> +	if (!kvm_has_feat(kvm, ID_AA64MMFR3_EL1, S1POE, IMP))
> +		kvm->arch.fgu[HFGxTR_GROUP] |= (HFGxTR_EL2_nPOR_EL1 |
> +						HFGxTR_EL2_nPOR_EL0);
> +

As Broonie pointed out in a separate thread, this cannot work, short
of making ID_AA64MMFR3_EL1 writable.

This can be done in a separate patch, but it needs doing as it
otherwise breaks migration.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

* Re: [PATCH v4 07/29] KVM: arm64: Save/restore POE registers
  2024-08-16 14:55   ` Marc Zyngier
@ 2024-08-16 15:13     ` Joey Gouly
  2024-08-16 15:32       ` Marc Zyngier
  0 siblings, 1 reply; 146+ messages in thread
From: Joey Gouly @ 2024-08-16 15:13 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, mingo, mpe, naveen.n.rao,
	npiggin, oliver.upton, shuah, szabolcs.nagy, tglx, will, x86,
	kvmarm

On Fri, Aug 16, 2024 at 03:55:11PM +0100, Marc Zyngier wrote:
> On Fri, 03 May 2024 14:01:25 +0100,
> Joey Gouly <joey.gouly@arm.com> wrote:
> > 
> > Define the new system registers that POE introduces and context switch them.
> > 
> > Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> > Cc: Marc Zyngier <maz@kernel.org>
> > Cc: Oliver Upton <oliver.upton@linux.dev>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > ---
> >  arch/arm64/include/asm/kvm_host.h          |  4 +++
> >  arch/arm64/include/asm/vncr_mapping.h      |  1 +
> >  arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 29 ++++++++++++++++++++++
> >  arch/arm64/kvm/sys_regs.c                  |  8 ++++--
> >  4 files changed, 40 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index 9e8a496fb284..28042da0befd 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -419,6 +419,8 @@ enum vcpu_sysreg {
> >  	GCR_EL1,	/* Tag Control Register */
> >  	TFSRE0_EL1,	/* Tag Fault Status Register (EL0) */
> >  
> > +	POR_EL0,	/* Permission Overlay Register 0 (EL0) */
> > +
> >  	/* 32bit specific registers. */
> >  	DACR32_EL2,	/* Domain Access Control Register */
> >  	IFSR32_EL2,	/* Instruction Fault Status Register */
> > @@ -489,6 +491,8 @@ enum vcpu_sysreg {
> >  	VNCR(PIR_EL1),	 /* Permission Indirection Register 1 (EL1) */
> >  	VNCR(PIRE0_EL1), /*  Permission Indirection Register 0 (EL1) */
> >  
> > +	VNCR(POR_EL1),	/* Permission Overlay Register 1 (EL1) */
> > +
> >  	VNCR(HFGRTR_EL2),
> >  	VNCR(HFGWTR_EL2),
> >  	VNCR(HFGITR_EL2),
> > diff --git a/arch/arm64/include/asm/vncr_mapping.h b/arch/arm64/include/asm/vncr_mapping.h
> > index df2c47c55972..06f8ec0906a6 100644
> > --- a/arch/arm64/include/asm/vncr_mapping.h
> > +++ b/arch/arm64/include/asm/vncr_mapping.h
> > @@ -52,6 +52,7 @@
> >  #define VNCR_PIRE0_EL1		0x290
> >  #define VNCR_PIRE0_EL2		0x298
> >  #define VNCR_PIR_EL1		0x2A0
> > +#define VNCR_POR_EL1		0x2A8
> >  #define VNCR_ICH_LR0_EL2        0x400
> >  #define VNCR_ICH_LR1_EL2        0x408
> >  #define VNCR_ICH_LR2_EL2        0x410
> > diff --git a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
> > index 4be6a7fa0070..1c9536557bae 100644
> > --- a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
> > +++ b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
> > @@ -16,9 +16,15 @@
> >  #include <asm/kvm_hyp.h>
> >  #include <asm/kvm_mmu.h>
> >  
> > +static inline bool ctxt_has_s1poe(struct kvm_cpu_context *ctxt);
> > +
> >  static inline void __sysreg_save_common_state(struct kvm_cpu_context *ctxt)
> >  {
> >  	ctxt_sys_reg(ctxt, MDSCR_EL1)	= read_sysreg(mdscr_el1);
> > +
> > +	// POR_EL0 can affect uaccess, so must be saved/restored early.
> > +	if (ctxt_has_s1poe(ctxt))
> > +		ctxt_sys_reg(ctxt, POR_EL0)	= read_sysreg_s(SYS_POR_EL0);
> >  }
> >  
> >  static inline void __sysreg_save_user_state(struct kvm_cpu_context *ctxt)
> > @@ -55,6 +61,17 @@ static inline bool ctxt_has_s1pie(struct kvm_cpu_context *ctxt)
> >  	return kvm_has_feat(kern_hyp_va(vcpu->kvm), ID_AA64MMFR3_EL1, S1PIE, IMP);
> >  }
> >  
> > +static inline bool ctxt_has_s1poe(struct kvm_cpu_context *ctxt)
> > +{
> > +	struct kvm_vcpu *vcpu;
> > +
> > +	if (!system_supports_poe())
> > +		return false;
> > +
> > +	vcpu = ctxt_to_vcpu(ctxt);
> > +	return kvm_has_feat(kern_hyp_va(vcpu->kvm), ID_AA64MMFR3_EL1, S1POE, IMP);
> > +}
> > +
> >  static inline void __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
> >  {
> >  	ctxt_sys_reg(ctxt, SCTLR_EL1)	= read_sysreg_el1(SYS_SCTLR);
> > @@ -77,6 +94,10 @@ static inline void __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
> >  		ctxt_sys_reg(ctxt, PIR_EL1)	= read_sysreg_el1(SYS_PIR);
> >  		ctxt_sys_reg(ctxt, PIRE0_EL1)	= read_sysreg_el1(SYS_PIRE0);
> >  	}
> > +
> > +	if (ctxt_has_s1poe(ctxt))
> > +		ctxt_sys_reg(ctxt, POR_EL1)	= read_sysreg_el1(SYS_POR);
> > +
> >  	ctxt_sys_reg(ctxt, PAR_EL1)	= read_sysreg_par();
> >  	ctxt_sys_reg(ctxt, TPIDR_EL1)	= read_sysreg(tpidr_el1);
> >  
> > @@ -107,6 +128,10 @@ static inline void __sysreg_save_el2_return_state(struct kvm_cpu_context *ctxt)
> >  static inline void __sysreg_restore_common_state(struct kvm_cpu_context *ctxt)
> >  {
> >  	write_sysreg(ctxt_sys_reg(ctxt, MDSCR_EL1),  mdscr_el1);
> > +
> > +	// POR_EL0 can affect uaccess, so must be saved/restored early.
> > +	if (ctxt_has_s1poe(ctxt))
> > +		write_sysreg_s(ctxt_sys_reg(ctxt, POR_EL0),	SYS_POR_EL0);
> >  }
> >  
> >  static inline void __sysreg_restore_user_state(struct kvm_cpu_context *ctxt)
> > @@ -153,6 +178,10 @@ static inline void __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt)
> >  		write_sysreg_el1(ctxt_sys_reg(ctxt, PIR_EL1),	SYS_PIR);
> >  		write_sysreg_el1(ctxt_sys_reg(ctxt, PIRE0_EL1),	SYS_PIRE0);
> >  	}
> > +
> > +	if (ctxt_has_s1poe(ctxt))
> > +		write_sysreg_el1(ctxt_sys_reg(ctxt, POR_EL1),	SYS_POR);
> > +
> >  	write_sysreg(ctxt_sys_reg(ctxt, PAR_EL1),	par_el1);
> >  	write_sysreg(ctxt_sys_reg(ctxt, TPIDR_EL1),	tpidr_el1);
> >  
> > diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> > index c9f4f387155f..be04fae35afb 100644
> > --- a/arch/arm64/kvm/sys_regs.c
> > +++ b/arch/arm64/kvm/sys_regs.c
> > @@ -2423,6 +2423,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
> >  	{ SYS_DESC(SYS_MAIR_EL1), access_vm_reg, reset_unknown, MAIR_EL1 },
> >  	{ SYS_DESC(SYS_PIRE0_EL1), NULL, reset_unknown, PIRE0_EL1 },
> >  	{ SYS_DESC(SYS_PIR_EL1), NULL, reset_unknown, PIR_EL1 },
> > +	{ SYS_DESC(SYS_POR_EL1), NULL, reset_unknown, POR_EL1 },
> >  	{ SYS_DESC(SYS_AMAIR_EL1), access_vm_reg, reset_amair_el1, AMAIR_EL1 },
> >  
> >  	{ SYS_DESC(SYS_LORSA_EL1), trap_loregion },
> > @@ -2506,6 +2507,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
> >  	  .access = access_pmovs, .reg = PMOVSSET_EL0,
> >  	  .get_user = get_pmreg, .set_user = set_pmreg },
> >  
> > +	{ SYS_DESC(SYS_POR_EL0), NULL, reset_unknown, POR_EL0 },
> >  	{ SYS_DESC(SYS_TPIDR_EL0), NULL, reset_unknown, TPIDR_EL0 },
> >  	{ SYS_DESC(SYS_TPIDRRO_EL0), NULL, reset_unknown, TPIDRRO_EL0 },
> >  	{ SYS_DESC(SYS_TPIDR2_EL0), undef_access },
> > @@ -4057,8 +4059,6 @@ void kvm_init_sysreg(struct kvm_vcpu *vcpu)
> >  	kvm->arch.fgu[HFGxTR_GROUP] = (HFGxTR_EL2_nAMAIR2_EL1		|
> >  				       HFGxTR_EL2_nMAIR2_EL1		|
> >  				       HFGxTR_EL2_nS2POR_EL1		|
> > -				       HFGxTR_EL2_nPOR_EL1		|
> > -				       HFGxTR_EL2_nPOR_EL0		|
> >  				       HFGxTR_EL2_nACCDATA_EL1		|
> >  				       HFGxTR_EL2_nSMPRI_EL1_MASK	|
> >  				       HFGxTR_EL2_nTPIDR2_EL0_MASK);
> > @@ -4093,6 +4093,10 @@ void kvm_init_sysreg(struct kvm_vcpu *vcpu)
> >  		kvm->arch.fgu[HFGxTR_GROUP] |= (HFGxTR_EL2_nPIRE0_EL1 |
> >  						HFGxTR_EL2_nPIR_EL1);
> >  
> > +	if (!kvm_has_feat(kvm, ID_AA64MMFR3_EL1, S1POE, IMP))
> > +		kvm->arch.fgu[HFGxTR_GROUP] |= (HFGxTR_EL2_nPOR_EL1 |
> > +						HFGxTR_EL2_nPOR_EL0);
> > +
> 
> As Broonie pointed out in a separate thread, this cannot work, short
> of making ID_AA64MMFR3_EL1 writable.
> 
> This can be done in a separate patch, but it needs doing as it
> otherwise breaks migration.
> 
> Thanks,
> 
> 	M.
> 

Looks like it's wrong for PIE currently too, but your patch here fixes that:
	https://lore.kernel.org/kvmarm/20240813144738.2048302-11-maz@kernel.org/

If I basically apply that patch, but only for POE, the conflict can be resolved
later, or a rebase will fix it up, depending on what goes through first.

Thanks,
Joey

* Re: [PATCH v4 07/29] KVM: arm64: Save/restore POE registers
  2024-08-16 15:13     ` Joey Gouly
@ 2024-08-16 15:32       ` Marc Zyngier
  0 siblings, 0 replies; 146+ messages in thread
From: Marc Zyngier @ 2024-08-16 15:32 UTC (permalink / raw)
  To: Joey Gouly
  Cc: linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar, bp, broonie,
	catalin.marinas, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, mingo, mpe, naveen.n.rao,
	npiggin, oliver.upton, shuah, szabolcs.nagy, tglx, will, x86,
	kvmarm

On Fri, 16 Aug 2024 16:13:01 +0100,
Joey Gouly <joey.gouly@arm.com> wrote:
> 
> On Fri, Aug 16, 2024 at 03:55:11PM +0100, Marc Zyngier wrote:
> > On Fri, 03 May 2024 14:01:25 +0100,
> > Joey Gouly <joey.gouly@arm.com> wrote:
> > > 
> > > +	if (!kvm_has_feat(kvm, ID_AA64MMFR3_EL1, S1POE, IMP))
> > > +		kvm->arch.fgu[HFGxTR_GROUP] |= (HFGxTR_EL2_nPOR_EL1 |
> > > +						HFGxTR_EL2_nPOR_EL0);
> > > +
> > 
> > As Broonie pointed out in a separate thread, this cannot work, short
> > of making ID_AA64MMFR3_EL1 writable.
> > 
> > This can be done in a separate patch, but it needs doing as it
> > otherwise breaks migration.
> > 
> > Thanks,
> > 
> > 	M.
> > 
> 
> Looks like it's wrong for PIE currently too, but your patch here fixes that:
> 	https://lore.kernel.org/kvmarm/20240813144738.2048302-11-maz@kernel.org/
> 
> If I basically apply that patch, but only for POE, the conflict can be resolved
> later, or a rebase will fix it up, depending on what goes through first.

If I trust my feature dependency decoder, you need to make both
TCRX and POE writable:

(FEAT_S1POE --> v8Ap8)
(FEAT_S1POE --> FEAT_TCR2)
(FEAT_S1POE --> FEAT_ATS1A)
(FEAT_S1POE --> FEAT_HPDS)
(FEAT_S1POE <-> (AArch64 ID_AA64MMFR3_EL1.S1POE >= 1))
(FEAT_TCR2 --> v8Ap0)
(v8Ap9 --> FEAT_TCR2)
((FEAT_TCR2 && FEAT_AA64EL2) --> FEAT_HCX)
(FEAT_TCR2 <-> (AArch64 ID_AA64MMFR3_EL1.TCRX >= 1))

Feel free to lift part of that patch as you see fit.

	M.

-- 
Without deviation from the norm, progress is not possible.

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-08-15 15:09                 ` Dave Martin
  2024-08-15 15:24                   ` Mark Brown
@ 2024-08-19 17:09                   ` Catalin Marinas
  2024-08-20  9:54                     ` Joey Gouly
  1 sibling, 1 reply; 146+ messages in thread
From: Catalin Marinas @ 2024-08-19 17:09 UTC (permalink / raw)
  To: Dave Martin
  Cc: Joey Gouly, linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar,
	bp, broonie, christophe.leroy, dave.hansen, hpa, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On Thu, Aug 15, 2024 at 04:09:26PM +0100, Dave P Martin wrote:
> On Thu, Aug 15, 2024 at 02:18:15PM +0100, Joey Gouly wrote:
> > That's a lot of words to say, or ask, do you agree with the approach of only
> > saving POR_EL0 in the signal frame if num_allocated_pkeys() > 1?
> > 
> > Thanks,
> > Joey
> 
> ...So..., given all the above, it is perhaps best to go back to
> dumping POR_EL0 unconditionally after all, unless we have a mechanism
> to determine whether pkeys are in use at all.

Ah, I can see why checking for POR_EL0_INIT is useful. Only checking for
the allocated keys gets confusing with pkey 0.

Not sure what the deal is with pkey 0. Is it considered allocated by
default or unallocatable? If the former, it implies that pkeys are
already in use (hence the additional check for POR_EL0_INIT). In
principle the hardware allows us to use permissions where the pkeys do
not apply but we'd run out of indices and PTE bits to encode them, so I
think by default we should assume that pkey 0 is pre-allocated.

So I agree that it's probably best to save it unconditionally.

-- 
Catalin

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-08-19 17:09                   ` Catalin Marinas
@ 2024-08-20  9:54                     ` Joey Gouly
  2024-08-20 13:54                       ` Dave Martin
  0 siblings, 1 reply; 146+ messages in thread
From: Joey Gouly @ 2024-08-20  9:54 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Dave Martin, linux-arm-kernel, akpm, aneesh.kumar, aneesh.kumar,
	bp, broonie, christophe.leroy, dave.hansen, hpa, linux-fsdevel,
	linux-mm, linuxppc-dev, maz, mingo, mpe, naveen.n.rao, npiggin,
	oliver.upton, shuah, szabolcs.nagy, tglx, will, x86, kvmarm

On Mon, Aug 19, 2024 at 06:09:06PM +0100, Catalin Marinas wrote:
> On Thu, Aug 15, 2024 at 04:09:26PM +0100, Dave P Martin wrote:
> > On Thu, Aug 15, 2024 at 02:18:15PM +0100, Joey Gouly wrote:
> > > That's a lot of words to say, or ask, do you agree with the approach of only
> > > saving POR_EL0 in the signal frame if num_allocated_pkeys() > 1?
> > > 
> > > Thanks,
> > > Joey
> > 
> > ...So..., given all the above, it is perhaps best to go back to
> > dumping POR_EL0 unconditionally after all, unless we have a mechanism
> > to determine whether pkeys are in use at all.
> 
> Ah, I can see why checking for POR_EL0_INIT is useful. Only checking for
> the allocated keys gets confusing with pkey 0.
> 
> Not sure what the deal is with pkey 0. Is it considered allocated by
> default or unallocatable? If the former, it implies that pkeys are
> already in use (hence the additional check for POR_EL0_INIT). In
> principle the hardware allows us to use permissions where the pkeys do
> not apply but we'd run out of indices and PTE bits to encode them, so I
> think by default we should assume that pkey 0 is pre-allocated.
> 
> 

You can consider pkey 0 allocated by default. You can actually pkey_free(0); there's nothing stopping that.
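
As a quick userspace illustration (a sketch only; whether SYS_pkey_free
is defined, and the exact result, depend on the toolchain and kernel):

```c
/*
 * Sketch only: the kernel does not special-case pkey 0, so
 * pkey_free(0) takes the same path as freeing any other key.
 */
#define _GNU_SOURCE
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef SYS_pkey_free
#define SYS_pkey_free 290	/* x86-64 syscall number (assumed fallback) */
#endif

/* Returns 0 if the kernel accepted the free, otherwise the errno. */
int try_pkey_free(int pkey)
{
	if (syscall(SYS_pkey_free, pkey) == 0)
		return 0;
	return errno;
}
```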

> So I agree that it's probably best to save it unconditionally.

Alright, will leave it as is!

Thanks,
Joey

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-08-20  9:54                     ` Joey Gouly
@ 2024-08-20 13:54                       ` Dave Martin
  2024-08-20 14:06                         ` Joey Gouly
  0 siblings, 1 reply; 146+ messages in thread
From: Dave Martin @ 2024-08-20 13:54 UTC (permalink / raw)
  To: Joey Gouly
  Cc: Catalin Marinas, linux-arm-kernel, akpm, aneesh.kumar,
	aneesh.kumar, bp, broonie, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Tue, Aug 20, 2024 at 10:54:41AM +0100, Joey Gouly wrote:
> On Mon, Aug 19, 2024 at 06:09:06PM +0100, Catalin Marinas wrote:
> > On Thu, Aug 15, 2024 at 04:09:26PM +0100, Dave P Martin wrote:
> > > On Thu, Aug 15, 2024 at 02:18:15PM +0100, Joey Gouly wrote:
> > > > That's a lot of words to say, or ask, do you agree with the approach of only
> > > > saving POR_EL0 in the signal frame if num_allocated_pkeys() > 1?
> > > > 
> > > > Thanks,
> > > > Joey
> > > 
> > > ...So..., given all the above, it is perhaps best to go back to
> > > dumping POR_EL0 unconditionally after all, unless we have a mechanism
> > > to determine whether pkeys are in use at all.
> > 
> > Ah, I can see why checking for POR_EL0_INIT is useful. Only checking for
> > the allocated keys gets confusing with pkey 0.
> > 
> > Not sure what the deal is with pkey 0. Is it considered allocated by
> > default or unallocatable? If the former, it implies that pkeys are
> > already in use (hence the additional check for POR_EL0_INIT). In
> > principle the hardware allows us to use permissions where the pkeys do
> > not apply but we'd run out of indices and PTE bits to encode them, so I
> > think by default we should assume that pkey 0 is pre-allocated.
> > 
> > 
> 
> You can consider pkey 0 allocated by default. You can actually pkey_free(0), there's nothing stopping that.

Is that intentional?

You're not supposed to free pkeys that are in use, and it's quasi-
impossible to know whether pkey 0 is in use: all binaries in the
process assume that pkey is available and use it by default for their
pages, plus the stack will be painted with pkey 0, and the vDSO has to
be painted with some pkey.

Actually, that's a good point, because of the vDSO I think that only
special bits of code with a private ABI (e.g., JITted code etc.) that
definitely don't call into the vDSO can block permissions on pkey 0...
otherwise, stuff will break.

> 
> > So I agree that it's probably best to save it unconditionally.
> 
> Alright, will leave it as is!

Ack, I think the whole discussion around this has shown that there
isn't a _simple_ argument for conditionally dumping POR_EL0... so I'm
prepared to admit defeat here.

We might still try to slow down the consumption of the remaining space
with a "misc registers" record, instead of dedicating a record to
POR_EL0.  I have some thoughts on that, but if nobody cares that much
then this probably isn't worth pursuing.
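
To sketch the idea (the magic value, names and layout are entirely made
up here, just following the usual struct _aarch64_ctx record
convention):

```c
/*
 * Hypothetical "misc registers" signal-frame record. Everything below
 * is invented for illustration; only the head/magic/size convention
 * mirrors the real arm64 sigcontext records.
 */
#include <stdint.h>

struct _aarch64_ctx_sketch {	/* stand-in for uapi struct _aarch64_ctx */
	uint32_t magic;
	uint32_t size;
};

#define MISC_MAGIC_SKETCH 0x4d495343	/* "MISC" -- invented value */

struct misc_context_sketch {
	struct _aarch64_ctx_sketch head;
	uint64_t por_el0;	/* first occupant of the shared record */
	/* further small registers could be appended later, bumping size */
};
```

One record like this would then consume a single magic number for all
future small registers, instead of one per register.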

Cheers
---Dave

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-08-20 13:54                       ` Dave Martin
@ 2024-08-20 14:06                         ` Joey Gouly
  2024-08-20 14:45                           ` Dave Martin
  0 siblings, 1 reply; 146+ messages in thread
From: Joey Gouly @ 2024-08-20 14:06 UTC (permalink / raw)
  To: Dave Martin
  Cc: Catalin Marinas, linux-arm-kernel, akpm, aneesh.kumar,
	aneesh.kumar, bp, broonie, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Tue, Aug 20, 2024 at 02:54:50PM +0100, Dave Martin wrote:
> On Tue, Aug 20, 2024 at 10:54:41AM +0100, Joey Gouly wrote:
> > On Mon, Aug 19, 2024 at 06:09:06PM +0100, Catalin Marinas wrote:
> > > On Thu, Aug 15, 2024 at 04:09:26PM +0100, Dave P Martin wrote:
> > > > On Thu, Aug 15, 2024 at 02:18:15PM +0100, Joey Gouly wrote:
> > > > > That's a lot of words to say, or ask, do you agree with the approach of only
> > > > > saving POR_EL0 in the signal frame if num_allocated_pkeys() > 1?
> > > > > 
> > > > > Thanks,
> > > > > Joey
> > > > 
> > > > ...So..., given all the above, it is perhaps best to go back to
> > > > dumping POR_EL0 unconditionally after all, unless we have a mechanism
> > > > to determine whether pkeys are in use at all.
> > > 
> > > Ah, I can see why checking for POR_EL0_INIT is useful. Only checking for
> > > the allocated keys gets confusing with pkey 0.
> > > 
> > > Not sure what the deal is with pkey 0. Is it considered allocated by
> > > default or unallocatable? If the former, it implies that pkeys are
> > > already in use (hence the additional check for POR_EL0_INIT). In
> > > principle the hardware allows us to use permissions where the pkeys do
> > > not apply but we'd run out of indices and PTE bits to encode them, so I
> > > think by default we should assume that pkey 0 is pre-allocated.
> > > 
> > > 
> > 
> > You can consider pkey 0 allocated by default. You can actually pkey_free(0), there's nothing stopping that.
> 
> Is that intentional?

I don't really know? It's intentional from my side in that I allow it,
because it doesn't look like x86 or PPC block pkey_free(0).

I found this code that does pkey_free(0), but obviously it's a bit of a weird test case:

	https://github.com/ColinIanKing/stress-ng/blob/master/test/test-pkey-free.c#L29

> 
> You're not supposed to free pkeys that are in use, and it's quasi-
> impossible to know whether pkey 0 is in use: all binaries in the
> process assume that pkey is available and use it by default for their
> pages, plus the stack will be painted with pkey 0, and the vDSO has to
> be painted with some pkey.
> 
> Actually, that's a good point, because of the vDSO I think that only
> special bits of code with a private ABI (e.g., JITted code etc.) that
> definitely don't call into the vDSO can block permissions on pkey 0...
> otherwise, stuff will break.
> 
> > 
> > > So I agree that it's probably best to save it unconditionally.
> > 
> > Alright, will leave it as is!
> 
> Ack, I think the whole discussion around this has shown that there
> isn't a _simple_ argument for conditionally dumping POR_EL0... so I'm
> prepared to admit defeat here.
> 
> We might still try to slow down the consumption of the remaining space
> with a "misc registers" record, instead of dedicating a record to
> POR_EL0.  I have some thoughts on that, but if nobody cares that much
> then this probably isn't worth pursuing.
> 
> Cheers
> ---Dave
> 

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v4 18/29] arm64: add POE signal support
  2024-08-20 14:06                         ` Joey Gouly
@ 2024-08-20 14:45                           ` Dave Martin
  0 siblings, 0 replies; 146+ messages in thread
From: Dave Martin @ 2024-08-20 14:45 UTC (permalink / raw)
  To: Joey Gouly
  Cc: Catalin Marinas, linux-arm-kernel, akpm, aneesh.kumar,
	aneesh.kumar, bp, broonie, christophe.leroy, dave.hansen, hpa,
	linux-fsdevel, linux-mm, linuxppc-dev, maz, mingo, mpe,
	naveen.n.rao, npiggin, oliver.upton, shuah, szabolcs.nagy, tglx,
	will, x86, kvmarm

On Tue, Aug 20, 2024 at 03:06:06PM +0100, Joey Gouly wrote:
> On Tue, Aug 20, 2024 at 02:54:50PM +0100, Dave Martin wrote:
> > On Tue, Aug 20, 2024 at 10:54:41AM +0100, Joey Gouly wrote:
> > > On Mon, Aug 19, 2024 at 06:09:06PM +0100, Catalin Marinas wrote:
> > > > On Thu, Aug 15, 2024 at 04:09:26PM +0100, Dave P Martin wrote:
> > > > > On Thu, Aug 15, 2024 at 02:18:15PM +0100, Joey Gouly wrote:
> > > > > > That's a lot of words to say, or ask, do you agree with the approach of only
> > > > > > saving POR_EL0 in the signal frame if num_allocated_pkeys() > 1?
> > > > > > 
> > > > > > Thanks,
> > > > > > Joey
> > > > > 
> > > > > ...So..., given all the above, it is perhaps best to go back to
> > > > > dumping POR_EL0 unconditionally after all, unless we have a mechanism
> > > > > to determine whether pkeys are in use at all.
> > > > 
> > > > Ah, I can see why checking for POR_EL0_INIT is useful. Only checking for
> > > > the allocated keys gets confusing with pkey 0.
> > > > 
> > > > Not sure what the deal is with pkey 0. Is it considered allocated by
> > > > default or unallocatable? If the former, it implies that pkeys are
> > > > already in use (hence the additional check for POR_EL0_INIT). In
> > > > principle the hardware allows us to use permissions where the pkeys do
> > > > not apply but we'd run out of indices and PTE bits to encode them, so I
> > > > think by default we should assume that pkey 0 is pre-allocated.
> > > > 
> > > > 
> > > 
> > > You can consider pkey 0 allocated by default. You can actually pkey_free(0), there's nothing stopping that.
> > 
> > Is that intentional?
> 
> I don't really know? It's intentional from my side in that it, I allow it,
> because it doesn't look like x86 or PPC block pkey_free(0).
> 
> I found this code that does pkey_free(0), but obviously it's a bit of a weird test case:
> 
> 	https://github.com/ColinIanKing/stress-ng/blob/master/test/test-pkey-free.c#L29

Of course, pkey 0 will still be in use for everything, and if the man
pages are to be believed, the PKRU bits for pkey 0 may no longer be
maintained after this call...

So this test is possibly a little braindead.  A clear use-case for
freeing pkey 0 would be more convincing.

In the meantime though, it makes most sense for arm64 to follow
the precedent set by other arches on this (as you did).

[...]

Cheers
---Dave

^ permalink raw reply	[flat|nested] 146+ messages in thread

end of thread, other threads:[~2024-08-20 14:45 UTC | newest]

Thread overview: 146+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-03 13:01 [PATCH v4 00/29] arm64: Permission Overlay Extension Joey Gouly
2024-05-03 13:01 ` [PATCH v4 01/29] powerpc/mm: add ARCH_PKEY_BITS to Kconfig Joey Gouly
2024-05-06  8:57   ` Michael Ellerman
2024-05-03 13:01 ` [PATCH v4 02/29] x86/mm: " Joey Gouly
2024-05-03 16:40   ` Dave Hansen
2024-05-03 13:01 ` [PATCH v4 03/29] mm: use ARCH_PKEY_BITS to define VM_PKEY_BITN Joey Gouly
2024-05-03 16:41   ` Dave Hansen
2024-07-15  7:53   ` Anshuman Khandual
2024-05-03 13:01 ` [PATCH v4 04/29] arm64: disable trapping of POR_EL0 to EL2 Joey Gouly
2024-07-15  7:47   ` Anshuman Khandual
2024-07-25 15:44   ` Dave Martin
2024-08-06 10:04     ` Joey Gouly
2024-05-03 13:01 ` [PATCH v4 05/29] arm64: cpufeature: add Permission Overlay Extension cpucap Joey Gouly
2024-06-21 16:58   ` Catalin Marinas
2024-06-21 17:01   ` Catalin Marinas
2024-06-21 17:02     ` Catalin Marinas
2024-07-15  7:47   ` Anshuman Khandual
2024-05-03 13:01 ` [PATCH v4 06/29] arm64: context switch POR_EL0 register Joey Gouly
2024-06-21 17:03   ` Catalin Marinas
2024-06-21 17:07   ` Catalin Marinas
2024-07-15  8:27   ` Anshuman Khandual
2024-07-16 13:21     ` Mark Brown
2024-07-18 14:16     ` Joey Gouly
2024-07-22 13:40   ` Kevin Brodsky
2024-07-25 15:46   ` Dave Martin
2024-05-03 13:01 ` [PATCH v4 07/29] KVM: arm64: Save/restore POE registers Joey Gouly
2024-05-29 15:43   ` Marc Zyngier
2024-08-16 14:55   ` Marc Zyngier
2024-08-16 15:13     ` Joey Gouly
2024-08-16 15:32       ` Marc Zyngier
2024-05-03 13:01 ` [PATCH v4 08/29] KVM: arm64: make kvm_at() take an OP_AT_* Joey Gouly
2024-05-29 15:46   ` Marc Zyngier
2024-07-15  8:36   ` Anshuman Khandual
2024-05-03 13:01 ` [PATCH v4 09/29] KVM: arm64: use `at s1e1a` for POE Joey Gouly
2024-05-29 15:50   ` Marc Zyngier
2024-07-15  8:45   ` Anshuman Khandual
2024-05-03 13:01 ` [PATCH v4 10/29] arm64: enable the Permission Overlay Extension for EL0 Joey Gouly
2024-06-21 17:04   ` Catalin Marinas
2024-07-15  9:13   ` Anshuman Khandual
2024-07-15 20:16   ` Mark Brown
2024-07-25 15:49   ` Dave Martin
2024-08-01 16:04     ` Joey Gouly
2024-08-01 16:31       ` Dave Martin
2024-05-03 13:01 ` [PATCH v4 11/29] arm64: re-order MTE VM_ flags Joey Gouly
2024-06-21 17:04   ` Catalin Marinas
2024-07-15  9:21   ` Anshuman Khandual
2024-05-03 13:01 ` [PATCH v4 12/29] arm64: add POIndex defines Joey Gouly
2024-06-21 17:05   ` Catalin Marinas
2024-07-15  9:26   ` Anshuman Khandual
2024-05-03 13:01 ` [PATCH v4 13/29] arm64: convert protection key into vm_flags and pgprot values Joey Gouly
2024-05-28  6:54   ` Amit Daniel Kachhap
2024-06-19 16:45     ` Catalin Marinas
2024-07-04 12:47       ` Joey Gouly
2024-07-08 17:22         ` Catalin Marinas
2024-07-16  9:05   ` Anshuman Khandual
2024-07-16  9:34     ` Joey Gouly
2024-07-25 15:49   ` Dave Martin
2024-08-01 10:55     ` Joey Gouly
2024-08-01 11:01       ` Dave Martin
2024-05-03 13:01 ` [PATCH v4 14/29] arm64: mask out POIndex when modifying a PTE Joey Gouly
2024-07-16  9:10   ` Anshuman Khandual
2024-05-03 13:01 ` [PATCH v4 15/29] arm64: handle PKEY/POE faults Joey Gouly
2024-06-21 16:57   ` Catalin Marinas
2024-07-09 13:03   ` Kevin Brodsky
2024-07-16 10:13   ` Anshuman Khandual
2024-07-25 15:57   ` Dave Martin
2024-08-01 16:01     ` Joey Gouly
2024-08-06 13:33       ` Dave Martin
2024-08-06 13:43         ` Joey Gouly
2024-08-06 14:38           ` Dave Martin
2024-05-03 13:01 ` [PATCH v4 16/29] arm64: add pte_access_permitted_no_overlay() Joey Gouly
2024-06-21 17:15   ` Catalin Marinas
2024-07-16 10:21   ` Anshuman Khandual
2024-05-03 13:01 ` [PATCH v4 17/29] arm64: implement PKEYS support Joey Gouly
2024-05-28  6:55   ` Amit Daniel Kachhap
2024-05-28 11:26     ` Joey Gouly
2024-05-31 14:57   ` Szabolcs Nagy
2024-05-31 15:21     ` Joey Gouly
2024-05-31 16:27       ` Szabolcs Nagy
2024-06-17 13:40         ` Florian Weimer
2024-06-17 14:51           ` Szabolcs Nagy
2024-07-08 17:53             ` Catalin Marinas
2024-07-09  8:32               ` Szabolcs Nagy
2024-07-09  8:52                 ` Florian Weimer
2024-07-11  9:50               ` Joey Gouly
2024-07-18 14:45                 ` Szabolcs Nagy
2024-07-05 16:59   ` Catalin Marinas
2024-07-22 13:39     ` Kevin Brodsky
2024-07-09 13:07   ` Kevin Brodsky
2024-07-16 11:40     ` Anshuman Khandual
2024-07-23  4:22   ` Anshuman Khandual
2024-07-25 16:12   ` Dave Martin
2024-05-03 13:01 ` [PATCH v4 18/29] arm64: add POE signal support Joey Gouly
2024-05-28  6:56   ` Amit Daniel Kachhap
2024-05-31 16:39     ` Mark Brown
2024-06-03  9:21       ` Amit Daniel Kachhap
2024-07-25 15:58         ` Dave Martin
2024-07-25 18:11           ` Mark Brown
2024-07-26 16:14             ` Dave Martin
2024-07-26 17:39               ` Mark Brown
2024-07-29 14:27                 ` Dave Martin
2024-07-29 14:41                   ` Mark Brown
2024-07-05 17:04   ` Catalin Marinas
2024-07-09 13:08   ` Kevin Brodsky
2024-07-22  9:16   ` Anshuman Khandual
2024-07-25 16:00   ` Dave Martin
2024-08-01 15:54     ` Joey Gouly
2024-08-01 16:22       ` Dave Martin
2024-08-06 10:35         ` Joey Gouly
2024-08-06 14:31           ` Joey Gouly
2024-08-06 15:00             ` Dave Martin
2024-08-14 15:03             ` Catalin Marinas
2024-08-15 13:18               ` Joey Gouly
2024-08-15 15:09                 ` Dave Martin
2024-08-15 15:24                   ` Mark Brown
2024-08-19 17:09                   ` Catalin Marinas
2024-08-20  9:54                     ` Joey Gouly
2024-08-20 13:54                       ` Dave Martin
2024-08-20 14:06                         ` Joey Gouly
2024-08-20 14:45                           ` Dave Martin
2024-05-03 13:01 ` [PATCH v4 19/29] arm64: enable PKEY support for CPUs with S1POE Joey Gouly
2024-07-16 10:47   ` Anshuman Khandual
2024-07-25 15:48     ` Dave Martin
2024-07-25 16:00   ` Dave Martin
2024-05-03 13:01 ` [PATCH v4 20/29] arm64: enable POE and PIE to coexist Joey Gouly
2024-06-21 17:16   ` Catalin Marinas
2024-07-16 10:41   ` Anshuman Khandual
2024-07-16 13:46     ` Joey Gouly
2024-05-03 13:01 ` [PATCH v4 21/29] arm64/ptrace: add support for FEAT_POE Joey Gouly
2024-07-16 10:35   ` Anshuman Khandual
2024-05-03 13:01 ` [PATCH v4 22/29] arm64: add Permission Overlay Extension Kconfig Joey Gouly
2024-07-05 17:05   ` Catalin Marinas
2024-07-09 13:08   ` Kevin Brodsky
2024-07-16 11:02   ` Anshuman Khandual
2024-05-03 13:01 ` [PATCH v4 23/29] kselftest/arm64: move get_header() Joey Gouly
2024-05-03 13:01 ` [PATCH v4 24/29] selftests: mm: move fpregs printing Joey Gouly
2024-05-03 13:01 ` [PATCH v4 25/29] selftests: mm: make protection_keys test work on arm64 Joey Gouly
2024-05-03 13:01 ` [PATCH v4 26/29] kselftest/arm64: add HWCAP test for FEAT_S1POE Joey Gouly
2024-05-03 13:01 ` [PATCH v4 27/29] kselftest/arm64: parse POE_MAGIC in a signal frame Joey Gouly
2024-05-03 13:01 ` [PATCH v4 28/29] kselftest/arm64: Add test case for POR_EL0 signal frame records Joey Gouly
2024-05-29 15:51   ` Mark Brown
2024-07-05 19:34     ` Shuah Khan
2024-07-09 13:10   ` Kevin Brodsky
2024-05-03 13:01 ` [PATCH v4 29/29] KVM: selftests: get-reg-list: add Permission Overlay registers Joey Gouly
2024-05-05 14:41 ` [PATCH v4 00/29] arm64: Permission Overlay Extension Mark Brown
2024-05-28 11:30 ` Joey Gouly
