linux-mm.kvack.org archive mirror
* [PATCH 0/4 v2] Add KFENCE support for LoongArch
@ 2023-07-25  6:14 Enze Li
  2023-07-25  6:14 ` [PATCH 1/4 v2] LoongArch: mm: Add page table mapped mode support Enze Li
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Enze Li @ 2023-07-25  6:14 UTC (permalink / raw)
  To: chenhuacai, kernel, loongarch, glider, elver, akpm, kasan-dev,
	linux-mm
  Cc: zhangqing, yangtiezhu, dvyukov, Enze Li

Hi all,

This patchset adds KFENCE support on LoongArch.

To run the testcases, you will need to enable the following options,

-> Kernel hacking
   [*] Tracers
       [*] Support for tracing block IO actions (NEW)
   -> Kernel Testing and Coverage
      <*> KUnit - Enable support for unit tests

and then,

-> Kernel hacking
   -> Memory Debugging
      [*] KFENCE: low-overhead sampling-based memory safety error detector (NEW)
          <*> KFENCE integration test suite (NEW)
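
For reference, these menu entries should map to roughly the following
.config fragment (symbol names assumed from the menu titles, so please
double-check them against your tree):

  CONFIG_FTRACE=y
  CONFIG_BLK_DEV_IO_TRACE=y
  CONFIG_KUNIT=y
  CONFIG_KFENCE=y
  CONFIG_KFENCE_KUNIT_TEST=y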

With these options enabled, KFENCE will be tested during kernel startup,
and normally you will get the following feedback:

========================================================
[   35.326363 ] # kfence: pass:23 fail:0 skip:2 total:25
[   35.326486 ] # Totals: pass:23 fail:0 skip:2 total:25
[   35.326621 ] ok 1 kfence
========================================================

You might notice that 2 testcases have been skipped.  If you intend to
run all testcases, please enable CONFIG_INIT_ON_FREE_DEFAULT_ON, which
you can find here:

-> Security options
   -> Kernel hardening options
      -> Memory initialization
         [*] Enable heap memory zeroing on free by default

and then all testcases should pass:
========================================================
[   35.531860 ] # kfence: pass:25 fail:0 skip:0 total:25
[   35.531999 ] # Totals: pass:25 fail:0 skip:0 total:25
[   35.532135 ] ok 1 kfence
========================================================

v2:
   * Address Huacai's comments.
   * Fix typos in commit message.

Thanks,
Enze

Enze Li (4):
  LoongArch: mm: Add page table mapped mode support
  LoongArch: Get stack without NMI when providing regs parameter
  KFENCE: Defer the assignment of the local variable addr
  LoongArch: Add KFENCE support

 arch/loongarch/Kconfig               |  1 +
 arch/loongarch/include/asm/kfence.h  | 62 ++++++++++++++++++++++++++++
 arch/loongarch/include/asm/page.h    | 19 ++++++++-
 arch/loongarch/include/asm/pgtable.h | 16 ++++++-
 arch/loongarch/kernel/stacktrace.c   | 20 ++++++---
 arch/loongarch/mm/fault.c            | 22 ++++++----
 arch/loongarch/mm/pgtable.c          |  6 +++
 mm/kfence/core.c                     |  5 ++-
 8 files changed, 133 insertions(+), 18 deletions(-)
 create mode 100644 arch/loongarch/include/asm/kfence.h

-- 
2.34.1




* [PATCH 1/4 v2] LoongArch: mm: Add page table mapped mode support
  2023-07-25  6:14 [PATCH 0/4 v2] Add KFENCE support for LoongArch Enze Li
@ 2023-07-25  6:14 ` Enze Li
  2023-07-25  7:38   ` Huacai Chen
  2023-07-25  6:14 ` [PATCH 2/4 v2] LoongArch: Get stack without NMI when providing regs parameter Enze Li
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Enze Li @ 2023-07-25  6:14 UTC (permalink / raw)
  To: chenhuacai, kernel, loongarch, glider, elver, akpm, kasan-dev,
	linux-mm
  Cc: zhangqing, yangtiezhu, dvyukov, Enze Li

According to the LoongArch documentation online, there are two types of address
translation modes: direct mapped address translation mode (direct mapped mode)
and page table mapped address translation mode (page table mapped mode).

Currently, the upstream kernel only supports direct mapped mode.
This patch adds a function that determines whether page table mapped
mode should be used, and also adds the corresponding handler functions
for both modes.

For more details on the two modes, see [1].

[1] https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#virtual-address-space-and-address-translation-mode
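
To illustrate the check below (a sketch, not part of the patch): with
cpu_vabits == 48, an address is treated as page table mapped when all
of its bits above bit 47 are set, i.e.

	/* sketch of the is_tlb_addr() logic, cpu_vabits fixed at 48 */
	static inline bool is_tlb_addr_sketch(unsigned long kaddr)
	{
		unsigned long mask = GENMASK(63, 48);	/* bits [63:48] */

		return (kaddr & mask) == mask;
	}

Addresses in the direct mapping windows do not have all of these high
bits set, so they keep using the pfn-based conversion.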

Signed-off-by: Enze Li <lienze@kylinos.cn>
---
 arch/loongarch/include/asm/page.h    | 19 ++++++++++++++++++-
 arch/loongarch/include/asm/pgtable.h |  2 ++
 arch/loongarch/mm/pgtable.c          |  6 ++++++
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/arch/loongarch/include/asm/page.h b/arch/loongarch/include/asm/page.h
index 26e8dccb6619..e43a2385b2cd 100644
--- a/arch/loongarch/include/asm/page.h
+++ b/arch/loongarch/include/asm/page.h
@@ -32,6 +32,7 @@
 
 #include <linux/kernel.h>
 #include <linux/pfn.h>
+#include <asm/cpu-features.h>
 
 /*
  * It's normally defined only for FLATMEM config but it's
@@ -84,7 +85,23 @@ typedef struct { unsigned long pgprot; } pgprot_t;
 #define sym_to_pfn(x)		__phys_to_pfn(__pa_symbol(x))
 
 #define virt_to_pfn(kaddr)	PFN_DOWN(PHYSADDR(kaddr))
-#define virt_to_page(kaddr)	pfn_to_page(virt_to_pfn(kaddr))
+
+static inline bool is_tlb_addr(unsigned long kaddr)
+{
+	if (unlikely((kaddr & GENMASK(BITS_PER_LONG - 1, cpu_vabits)) ==
+		     GENMASK(BITS_PER_LONG - 1, cpu_vabits)))
+		return true;
+	return false;
+}
+
+#define dwm_virt_to_page(kaddr)	pfn_to_page(virt_to_pfn(kaddr))
+
+#define virt_to_page(kaddr)						\
+({									\
+	is_tlb_addr((unsigned long)kaddr) ?				\
+	tlb_virt_to_page((unsigned long)kaddr) :			\
+	dwm_virt_to_page((unsigned long)kaddr);				\
+})
 
 extern int __virt_addr_valid(volatile void *kaddr);
 #define virt_addr_valid(kaddr)	__virt_addr_valid((volatile void *)(kaddr))
diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h
index 38afeb7dd58b..98a0c98de9d1 100644
--- a/arch/loongarch/include/asm/pgtable.h
+++ b/arch/loongarch/include/asm/pgtable.h
@@ -353,6 +353,8 @@ static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *pt
 #define PMD_T_LOG2	(__builtin_ffs(sizeof(pmd_t)) - 1)
 #define PTE_T_LOG2	(__builtin_ffs(sizeof(pte_t)) - 1)
 
+inline struct page *tlb_virt_to_page(unsigned long kaddr);
+
 extern pgd_t swapper_pg_dir[];
 extern pgd_t invalid_pg_dir[];
 
diff --git a/arch/loongarch/mm/pgtable.c b/arch/loongarch/mm/pgtable.c
index 36a6dc0148ae..20e7425d235d 100644
--- a/arch/loongarch/mm/pgtable.c
+++ b/arch/loongarch/mm/pgtable.c
@@ -9,6 +9,12 @@
 #include <asm/pgtable.h>
 #include <asm/tlbflush.h>
 
+inline struct page *tlb_virt_to_page(unsigned long kaddr)
+{
+	return pte_page(*virt_to_kpte(kaddr));
+}
+EXPORT_SYMBOL_GPL(tlb_virt_to_page);
+
 pgd_t *pgd_alloc(struct mm_struct *mm)
 {
 	pgd_t *ret, *init;
-- 
2.34.1




* [PATCH 2/4 v2] LoongArch: Get stack without NMI when providing regs parameter
  2023-07-25  6:14 [PATCH 0/4 v2] Add KFENCE support for LoongArch Enze Li
  2023-07-25  6:14 ` [PATCH 1/4 v2] LoongArch: mm: Add page table mapped mode support Enze Li
@ 2023-07-25  6:14 ` Enze Li
  2023-07-25  7:40   ` Huacai Chen
  2023-07-26  2:59   ` Jinyang He
  2023-07-25  6:14 ` [PATCH 3/4 v2] KFENCE: Defer the assignment of the local variable addr Enze Li
  2023-07-25  6:14 ` [PATCH 4/4 v2] LoongArch: Add KFENCE support Enze Li
  3 siblings, 2 replies; 16+ messages in thread
From: Enze Li @ 2023-07-25  6:14 UTC (permalink / raw)
  To: chenhuacai, kernel, loongarch, glider, elver, akpm, kasan-dev,
	linux-mm
  Cc: zhangqing, yangtiezhu, dvyukov, Enze Li

Currently, arch_stack_walk() can only get the full stack information
including NMI.  This is because the implementation of arch_stack_walk()
is forced to ignore the information passed by the regs parameter and use
the current stack information instead.

For some detection systems like KFENCE, only partial stack information
is needed, in particular the stack frame where the interrupt occurred.

To support KFENCE, this patch modifies the implementation of the
arch_stack_walk() function so that if this function is called with the
regs argument passed, it retains all the stack information in regs and
uses it to provide accurate information.
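
As a sketch of how this is consumed (not part of the patch), KFENCE
ends up doing roughly the following when reporting an error:

	/* entries[] is the report buffer; regs comes from the fault. */
	unsigned long entries[64];
	unsigned int nr;

	nr = stack_trace_save_regs(regs, entries, ARRAY_SIZE(entries), 0);

and with this change the saved trace starts from the faulting context
in regs instead of from inside the unwinder itself.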

Before this patch is applied, I get:
[    1.531195 ] ==================================================================
[    1.531442 ] BUG: KFENCE: out-of-bounds read in stack_trace_save_regs+0x48/0x6c
[    1.531442 ]
[    1.531900 ] Out-of-bounds read at 0xffff800012267fff (1B left of kfence-#12):
[    1.532046 ]  stack_trace_save_regs+0x48/0x6c
[    1.532169 ]  kfence_report_error+0xa4/0x528
[    1.532276 ]  kfence_handle_page_fault+0x124/0x270
[    1.532388 ]  no_context+0x50/0x94
[    1.532453 ]  do_page_fault+0x1a8/0x36c
[    1.532524 ]  tlb_do_page_fault_0+0x118/0x1b4
[    1.532623 ]  test_out_of_bounds_read+0xa0/0x1d8
[    1.532745 ]  kunit_generic_run_threadfn_adapter+0x1c/0x28
[    1.532854 ]  kthread+0x124/0x130
[    1.532922 ]  ret_from_kernel_thread+0xc/0xa4
<snip>

With this patch applied, I get the correct stack information.
[    1.320220 ] ==================================================================
[    1.320401 ] BUG: KFENCE: out-of-bounds read in test_out_of_bounds_read+0xa8/0x1d8
[    1.320401 ]
[    1.320898 ] Out-of-bounds read at 0xffff800012257fff (1B left of kfence-#10):
[    1.321134 ]  test_out_of_bounds_read+0xa8/0x1d8
[    1.321264 ]  kunit_generic_run_threadfn_adapter+0x1c/0x28
[    1.321392 ]  kthread+0x124/0x130
[    1.321459 ]  ret_from_kernel_thread+0xc/0xa4
<snip>

Signed-off-by: Enze Li <lienze@kylinos.cn>
---
 arch/loongarch/kernel/stacktrace.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/loongarch/kernel/stacktrace.c b/arch/loongarch/kernel/stacktrace.c
index 2463d2fea21f..9dab30ae68ec 100644
--- a/arch/loongarch/kernel/stacktrace.c
+++ b/arch/loongarch/kernel/stacktrace.c
@@ -18,16 +18,24 @@ void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
 	struct pt_regs dummyregs;
 	struct unwind_state state;
 
-	regs = &dummyregs;
-
 	if (task == current) {
-		regs->regs[3] = (unsigned long)__builtin_frame_address(0);
-		regs->csr_era = (unsigned long)__builtin_return_address(0);
+		if (regs)
+			memcpy(&dummyregs, regs, sizeof(*regs));
+		else {
+			dummyregs.regs[3] = (unsigned long)__builtin_frame_address(0);
+			dummyregs.csr_era = (unsigned long)__builtin_return_address(0);
+		}
 	} else {
-		regs->regs[3] = thread_saved_fp(task);
-		regs->csr_era = thread_saved_ra(task);
+		if (regs)
+			memcpy(&dummyregs, regs, sizeof(*regs));
+		else {
+			dummyregs.regs[3] = thread_saved_fp(task);
+			dummyregs.csr_era = thread_saved_ra(task);
+		}
 	}
 
+	regs = &dummyregs;
+
 	regs->regs[1] = 0;
 	for (unwind_start(&state, task, regs);
 	     !unwind_done(&state) && !unwind_error(&state); unwind_next_frame(&state)) {
-- 
2.34.1




* [PATCH 3/4 v2] KFENCE: Defer the assignment of the local variable addr
  2023-07-25  6:14 [PATCH 0/4 v2] Add KFENCE support for LoongArch Enze Li
  2023-07-25  6:14 ` [PATCH 1/4 v2] LoongArch: mm: Add page table mapped mode support Enze Li
  2023-07-25  6:14 ` [PATCH 2/4 v2] LoongArch: Get stack without NMI when providing regs parameter Enze Li
@ 2023-07-25  6:14 ` Enze Li
  2023-07-25  7:45   ` Huacai Chen
  2023-07-25  6:14 ` [PATCH 4/4 v2] LoongArch: Add KFENCE support Enze Li
  3 siblings, 1 reply; 16+ messages in thread
From: Enze Li @ 2023-07-25  6:14 UTC (permalink / raw)
  To: chenhuacai, kernel, loongarch, glider, elver, akpm, kasan-dev,
	linux-mm
  Cc: zhangqing, yangtiezhu, dvyukov, Enze Li

Unlike other architectures, LoongArch needs to update __kfence_pool
during arch_kfence_init_pool().

This patch defers the assignment of the local variable addr in
kfence_init_pool() to support the case where __kfence_pool is updated
in arch_kfence_init_pool().
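
In other words, the hazard being avoided is (sketch of the old flow):

	unsigned long addr = (unsigned long)__kfence_pool; /* latched early */

	if (!arch_kfence_init_pool())
		return addr;
	/* arch_kfence_init_pool() may have rebased __kfence_pool,
	 * so addr now points at the old, direct mapped pool (stale). */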

Signed-off-by: Enze Li <lienze@kylinos.cn>
Acked-by: Marco Elver <elver@google.com>
---
 mm/kfence/core.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index dad3c0eb70a0..e124ffff489f 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -566,13 +566,14 @@ static void rcu_guarded_free(struct rcu_head *h)
  */
 static unsigned long kfence_init_pool(void)
 {
-	unsigned long addr = (unsigned long)__kfence_pool;
+	unsigned long addr;
 	struct page *pages;
 	int i;
 
 	if (!arch_kfence_init_pool())
-		return addr;
+		return (unsigned long)__kfence_pool;
 
+	addr = (unsigned long)__kfence_pool;
 	pages = virt_to_page(__kfence_pool);
 
 	/*
-- 
2.34.1




* [PATCH 4/4 v2] LoongArch: Add KFENCE support
  2023-07-25  6:14 [PATCH 0/4 v2] Add KFENCE support for LoongArch Enze Li
                   ` (2 preceding siblings ...)
  2023-07-25  6:14 ` [PATCH 3/4 v2] KFENCE: Defer the assignment of the local variable addr Enze Li
@ 2023-07-25  6:14 ` Enze Li
  2023-07-25  7:48   ` Huacai Chen
                     ` (2 more replies)
  3 siblings, 3 replies; 16+ messages in thread
From: Enze Li @ 2023-07-25  6:14 UTC (permalink / raw)
  To: chenhuacai, kernel, loongarch, glider, elver, akpm, kasan-dev,
	linux-mm
  Cc: zhangqing, yangtiezhu, dvyukov, Enze Li

The LoongArch architecture is quite different from other architectures.
When the KFENCE pool is allocated, it is mapped through the direct
mapping configuration window [1] by default on LoongArch.  This means
that the page table mapped mode required by KFENCE cannot be used, so
the pool has to be remapped to an appropriate region.

This patch adds architecture specific implementation details for KFENCE.
In particular, this implements the required interface in <asm/kfence.h>.

Tested this patch by running the testcases; all of them passed.

[1] https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#virtual-address-space-and-address-translation-mode

Signed-off-by: Enze Li <lienze@kylinos.cn>
---
 arch/loongarch/Kconfig               |  1 +
 arch/loongarch/include/asm/kfence.h  | 62 ++++++++++++++++++++++++++++
 arch/loongarch/include/asm/pgtable.h | 14 ++++++-
 arch/loongarch/mm/fault.c            | 22 ++++++----
 4 files changed, 90 insertions(+), 9 deletions(-)
 create mode 100644 arch/loongarch/include/asm/kfence.h

diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index 70635ea3d1e4..5b63b16be49e 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -91,6 +91,7 @@ config LOONGARCH
 	select HAVE_ARCH_AUDITSYSCALL
 	select HAVE_ARCH_JUMP_LABEL
 	select HAVE_ARCH_JUMP_LABEL_RELATIVE
+	select HAVE_ARCH_KFENCE
 	select HAVE_ARCH_MMAP_RND_BITS if MMU
 	select HAVE_ARCH_SECCOMP_FILTER
 	select HAVE_ARCH_TRACEHOOK
diff --git a/arch/loongarch/include/asm/kfence.h b/arch/loongarch/include/asm/kfence.h
new file mode 100644
index 000000000000..fb39076fe4d7
--- /dev/null
+++ b/arch/loongarch/include/asm/kfence.h
@@ -0,0 +1,62 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * KFENCE support for LoongArch.
+ *
+ * Author: Enze Li <lienze@kylinos.cn>
+ * Copyright (C) 2022-2023 KylinSoft Corporation.
+ */
+
+#ifndef _ASM_LOONGARCH_KFENCE_H
+#define _ASM_LOONGARCH_KFENCE_H
+
+#include <linux/kfence.h>
+#include <asm/pgtable.h>
+#include <asm/tlb.h>
+
+static inline bool arch_kfence_init_pool(void)
+{
+	char *kfence_pool = __kfence_pool;
+	struct vm_struct *area;
+	int err;
+
+	area = __get_vm_area_caller(KFENCE_POOL_SIZE, VM_IOREMAP,
+				    KFENCE_AREA_START, KFENCE_AREA_END,
+				    __builtin_return_address(0));
+	if (!area)
+		return false;
+
+	__kfence_pool = (char *)area->addr;
+	err = ioremap_page_range((unsigned long)__kfence_pool,
+				 (unsigned long)__kfence_pool + KFENCE_POOL_SIZE,
+				 virt_to_phys((void *)kfence_pool),
+				 PAGE_KERNEL);
+	if (err) {
+		free_vm_area(area);
+		return false;
+	}
+
+	return true;
+}
+
+/* Protect the given page and flush TLB. */
+static inline bool kfence_protect_page(unsigned long addr, bool protect)
+{
+	pte_t *pte = virt_to_kpte(addr);
+
+	if (WARN_ON(!pte) || pte_none(*pte))
+		return false;
+
+	if (protect)
+		set_pte(pte, __pte(pte_val(*pte) & ~(_PAGE_VALID | _PAGE_PRESENT)));
+	else
+		set_pte(pte, __pte(pte_val(*pte) | (_PAGE_VALID | _PAGE_PRESENT)));
+
+	/* Flush this CPU's TLB. */
+	preempt_disable();
+	local_flush_tlb_one(addr);
+	preempt_enable();
+
+	return true;
+}
+
+#endif /* _ASM_LOONGARCH_KFENCE_H */
diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h
index 98a0c98de9d1..2702a6ba7122 100644
--- a/arch/loongarch/include/asm/pgtable.h
+++ b/arch/loongarch/include/asm/pgtable.h
@@ -77,6 +77,13 @@ extern unsigned long zero_page_mask;
 	(virt_to_page((void *)(empty_zero_page + (((unsigned long)(vaddr)) & zero_page_mask))))
 #define __HAVE_COLOR_ZERO_PAGE
 
+#ifdef CONFIG_KFENCE
+#define KFENCE_AREA_SIZE \
+	(((CONFIG_KFENCE_NUM_OBJECTS + 1) * 2 + 2) * PAGE_SIZE)
+#else
+#define KFENCE_AREA_SIZE	0
+#endif
+
 /*
  * TLB refill handlers may also map the vmalloc area into xkvrange.
  * Avoid the first couple of pages so NULL pointer dereferences will
@@ -88,11 +95,16 @@ extern unsigned long zero_page_mask;
 #define VMALLOC_START	MODULES_END
 #define VMALLOC_END	\
 	(vm_map_base +	\
-	 min(PTRS_PER_PGD * PTRS_PER_PUD * PTRS_PER_PMD * PTRS_PER_PTE * PAGE_SIZE, (1UL << cpu_vabits)) - PMD_SIZE - VMEMMAP_SIZE)
+	 min(PTRS_PER_PGD * PTRS_PER_PUD * PTRS_PER_PMD * PTRS_PER_PTE * PAGE_SIZE, (1UL << cpu_vabits)) - PMD_SIZE - VMEMMAP_SIZE - KFENCE_AREA_SIZE)
 
 #define vmemmap		((struct page *)((VMALLOC_END + PMD_SIZE) & PMD_MASK))
 #define VMEMMAP_END	((unsigned long)vmemmap + VMEMMAP_SIZE - 1)
 
+#ifdef CONFIG_KFENCE
+#define KFENCE_AREA_START	VMEMMAP_END
+#define KFENCE_AREA_END		(KFENCE_AREA_START + KFENCE_AREA_SIZE)
+#endif
+
 #define pte_ERROR(e) \
 	pr_err("%s:%d: bad pte %016lx.\n", __FILE__, __LINE__, pte_val(e))
 #ifndef __PAGETABLE_PMD_FOLDED
diff --git a/arch/loongarch/mm/fault.c b/arch/loongarch/mm/fault.c
index da5b6d518cdb..c0319128b221 100644
--- a/arch/loongarch/mm/fault.c
+++ b/arch/loongarch/mm/fault.c
@@ -23,6 +23,7 @@
 #include <linux/kprobes.h>
 #include <linux/perf_event.h>
 #include <linux/uaccess.h>
+#include <linux/kfence.h>
 
 #include <asm/branch.h>
 #include <asm/mmu_context.h>
@@ -30,7 +31,8 @@
 
 int show_unhandled_signals = 1;
 
-static void __kprobes no_context(struct pt_regs *regs, unsigned long address)
+static void __kprobes no_context(struct pt_regs *regs, unsigned long address,
+				 unsigned long write)
 {
 	const int field = sizeof(unsigned long) * 2;
 
@@ -38,6 +40,9 @@ static void __kprobes no_context(struct pt_regs *regs, unsigned long address)
 	if (fixup_exception(regs))
 		return;
 
+	if (kfence_handle_page_fault(address, write, regs))
+		return;
+
 	/*
 	 * Oops. The kernel tried to access some bad page. We'll have to
 	 * terminate things with extreme prejudice.
@@ -51,14 +56,15 @@ static void __kprobes no_context(struct pt_regs *regs, unsigned long address)
 	die("Oops", regs);
 }
 
-static void __kprobes do_out_of_memory(struct pt_regs *regs, unsigned long address)
+static void __kprobes do_out_of_memory(struct pt_regs *regs, unsigned long address,
+				       unsigned long write)
 {
 	/*
 	 * We ran out of memory, call the OOM killer, and return the userspace
 	 * (which will retry the fault, or kill us if we got oom-killed).
 	 */
 	if (!user_mode(regs)) {
-		no_context(regs, address);
+		no_context(regs, address, write);
 		return;
 	}
 	pagefault_out_of_memory();
@@ -69,7 +75,7 @@ static void __kprobes do_sigbus(struct pt_regs *regs,
 {
 	/* Kernel mode? Handle exceptions or die */
 	if (!user_mode(regs)) {
-		no_context(regs, address);
+		no_context(regs, address, write);
 		return;
 	}
 
@@ -90,7 +96,7 @@ static void __kprobes do_sigsegv(struct pt_regs *regs,
 
 	/* Kernel mode? Handle exceptions or die */
 	if (!user_mode(regs)) {
-		no_context(regs, address);
+		no_context(regs, address, write);
 		return;
 	}
 
@@ -149,7 +155,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs,
 	 */
 	if (address & __UA_LIMIT) {
 		if (!user_mode(regs))
-			no_context(regs, address);
+			no_context(regs, address, write);
 		else
 			do_sigsegv(regs, write, address, si_code);
 		return;
@@ -211,7 +217,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs,
 
 	if (fault_signal_pending(fault, regs)) {
 		if (!user_mode(regs))
-			no_context(regs, address);
+			no_context(regs, address, write);
 		return;
 	}
 
@@ -232,7 +238,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs,
 	if (unlikely(fault & VM_FAULT_ERROR)) {
 		mmap_read_unlock(mm);
 		if (fault & VM_FAULT_OOM) {
-			do_out_of_memory(regs, address);
+			do_out_of_memory(regs, address, write);
 			return;
 		} else if (fault & VM_FAULT_SIGSEGV) {
 			do_sigsegv(regs, write, address, si_code);
-- 
2.34.1




* Re: [PATCH 1/4 v2] LoongArch: mm: Add page table mapped mode support
  2023-07-25  6:14 ` [PATCH 1/4 v2] LoongArch: mm: Add page table mapped mode support Enze Li
@ 2023-07-25  7:38   ` Huacai Chen
  0 siblings, 0 replies; 16+ messages in thread
From: Huacai Chen @ 2023-07-25  7:38 UTC (permalink / raw)
  To: Enze Li
  Cc: kernel, loongarch, glider, elver, akpm, kasan-dev, linux-mm,
	zhangqing, yangtiezhu, dvyukov

Hi, Enze,

On Tue, Jul 25, 2023 at 2:15 PM Enze Li <lienze@kylinos.cn> wrote:
>
> According to the LoongArch documentation online, there are two types of address
> translation modes: direct mapped address translation mode (direct mapped mode)
> and page table mapped address translation mode (page table mapped mode).
>
> Currently, the upstream kernel only supports direct mapped mode.
> This patch adds a function that determines whether page table mapped
> mode should be used, and also adds the corresponding handler functions
> for both modes.
>
> For more details on the two modes, see [1].
>
> [1] https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#virtual-address-space-and-address-translation-mode
>
> Signed-off-by: Enze Li <lienze@kylinos.cn>
> ---
>  arch/loongarch/include/asm/page.h    | 19 ++++++++++++++++++-
>  arch/loongarch/include/asm/pgtable.h |  2 ++
>  arch/loongarch/mm/pgtable.c          |  6 ++++++
>  3 files changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/arch/loongarch/include/asm/page.h b/arch/loongarch/include/asm/page.h
> index 26e8dccb6619..e43a2385b2cd 100644
> --- a/arch/loongarch/include/asm/page.h
> +++ b/arch/loongarch/include/asm/page.h
> @@ -32,6 +32,7 @@
>
>  #include <linux/kernel.h>
>  #include <linux/pfn.h>
> +#include <asm/cpu-features.h>
>
>  /*
>   * It's normally defined only for FLATMEM config but it's
> @@ -84,7 +85,23 @@ typedef struct { unsigned long pgprot; } pgprot_t;
>  #define sym_to_pfn(x)          __phys_to_pfn(__pa_symbol(x))
>
>  #define virt_to_pfn(kaddr)     PFN_DOWN(PHYSADDR(kaddr))
> -#define virt_to_page(kaddr)    pfn_to_page(virt_to_pfn(kaddr))
> +
> +static inline bool is_tlb_addr(unsigned long kaddr)
> +{
> +       if (unlikely((kaddr & GENMASK(BITS_PER_LONG - 1, cpu_vabits)) ==
> +                    GENMASK(BITS_PER_LONG - 1, cpu_vabits)))
> +               return true;
> +       return false;
I think this helper can simply "return (kaddr >= vm_map_base)"? If so,
we can even remove this helper and use the simple condition in
virt_to_page().
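
i.e. something like this (sketch, keeping the helper names from the
patch):

	#define virt_to_page(kaddr)					\
	(((unsigned long)kaddr < vm_map_base) ?				\
		dwm_virt_to_page((unsigned long)kaddr) :	\
		tlb_virt_to_page((unsigned long)kaddr))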

> +}
> +
> +#define dwm_virt_to_page(kaddr)        pfn_to_page(virt_to_pfn(kaddr))
This should be "dmw", not "dwm", and since tlb_virt_to_page() is in a
.c file, this one should also be there.

Huacai
> +
> +#define virt_to_page(kaddr)                                            \
> +({                                                                     \
> +       is_tlb_addr((unsigned long)kaddr) ?                             \
> +       tlb_virt_to_page((unsigned long)kaddr) :                        \
> +       dwm_virt_to_page((unsigned long)kaddr);                         \
> +})
>
>  extern int __virt_addr_valid(volatile void *kaddr);
>  #define virt_addr_valid(kaddr) __virt_addr_valid((volatile void *)(kaddr))
> diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h
> index 38afeb7dd58b..98a0c98de9d1 100644
> --- a/arch/loongarch/include/asm/pgtable.h
> +++ b/arch/loongarch/include/asm/pgtable.h
> @@ -353,6 +353,8 @@ static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *pt
>  #define PMD_T_LOG2     (__builtin_ffs(sizeof(pmd_t)) - 1)
>  #define PTE_T_LOG2     (__builtin_ffs(sizeof(pte_t)) - 1)
>
> +inline struct page *tlb_virt_to_page(unsigned long kaddr);
> +
>  extern pgd_t swapper_pg_dir[];
>  extern pgd_t invalid_pg_dir[];
>
> diff --git a/arch/loongarch/mm/pgtable.c b/arch/loongarch/mm/pgtable.c
> index 36a6dc0148ae..20e7425d235d 100644
> --- a/arch/loongarch/mm/pgtable.c
> +++ b/arch/loongarch/mm/pgtable.c
> @@ -9,6 +9,12 @@
>  #include <asm/pgtable.h>
>  #include <asm/tlbflush.h>
>
> +inline struct page *tlb_virt_to_page(unsigned long kaddr)
> +{
> +       return pte_page(*virt_to_kpte(kaddr));
> +}
> +EXPORT_SYMBOL_GPL(tlb_virt_to_page);
> +
>  pgd_t *pgd_alloc(struct mm_struct *mm)
>  {
>         pgd_t *ret, *init;
> --
> 2.34.1
>
>



* Re: [PATCH 2/4 v2] LoongArch: Get stack without NMI when providing regs parameter
  2023-07-25  6:14 ` [PATCH 2/4 v2] LoongArch: Get stack without NMI when providing regs parameter Enze Li
@ 2023-07-25  7:40   ` Huacai Chen
  2023-07-26  2:59   ` Jinyang He
  1 sibling, 0 replies; 16+ messages in thread
From: Huacai Chen @ 2023-07-25  7:40 UTC (permalink / raw)
  To: Enze Li
  Cc: kernel, loongarch, glider, elver, akpm, kasan-dev, linux-mm,
	zhangqing, yangtiezhu, dvyukov

Hi, Enze,

On Tue, Jul 25, 2023 at 2:15 PM Enze Li <lienze@kylinos.cn> wrote:
>
> Currently, arch_stack_walk() can only get the full stack information
> including NMI.  This is because the implementation of arch_stack_walk()
> is forced to ignore the information passed by the regs parameter and use
> the current stack information instead.
>
> For some detection systems like KFENCE, only partial stack information
> is needed, in particular the stack frame where the interrupt occurred.
>
> To support KFENCE, this patch modifies the implementation of the
> arch_stack_walk() function so that if this function is called with the
> regs argument passed, it retains all the stack information in regs and
> uses it to provide accurate information.
>
> Before this patch is applied, I get:
> [    1.531195 ] ==================================================================
> [    1.531442 ] BUG: KFENCE: out-of-bounds read in stack_trace_save_regs+0x48/0x6c
> [    1.531442 ]
> [    1.531900 ] Out-of-bounds read at 0xffff800012267fff (1B left of kfence-#12):
> [    1.532046 ]  stack_trace_save_regs+0x48/0x6c
> [    1.532169 ]  kfence_report_error+0xa4/0x528
> [    1.532276 ]  kfence_handle_page_fault+0x124/0x270
> [    1.532388 ]  no_context+0x50/0x94
> [    1.532453 ]  do_page_fault+0x1a8/0x36c
> [    1.532524 ]  tlb_do_page_fault_0+0x118/0x1b4
> [    1.532623 ]  test_out_of_bounds_read+0xa0/0x1d8
> [    1.532745 ]  kunit_generic_run_threadfn_adapter+0x1c/0x28
> [    1.532854 ]  kthread+0x124/0x130
> [    1.532922 ]  ret_from_kernel_thread+0xc/0xa4
> <snip>
>
> With this patch applied, I get the correct stack information.
> [    1.320220 ] ==================================================================
> [    1.320401 ] BUG: KFENCE: out-of-bounds read in test_out_of_bounds_read+0xa8/0x1d8
> [    1.320401 ]
> [    1.320898 ] Out-of-bounds read at 0xffff800012257fff (1B left of kfence-#10):
> [    1.321134 ]  test_out_of_bounds_read+0xa8/0x1d8
> [    1.321264 ]  kunit_generic_run_threadfn_adapter+0x1c/0x28
> [    1.321392 ]  kthread+0x124/0x130
> [    1.321459 ]  ret_from_kernel_thread+0xc/0xa4
> <snip>
>
> Signed-off-by: Enze Li <lienze@kylinos.cn>
> ---
>  arch/loongarch/kernel/stacktrace.c | 20 ++++++++++++++------
>  1 file changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/arch/loongarch/kernel/stacktrace.c b/arch/loongarch/kernel/stacktrace.c
> index 2463d2fea21f..9dab30ae68ec 100644
> --- a/arch/loongarch/kernel/stacktrace.c
> +++ b/arch/loongarch/kernel/stacktrace.c
> @@ -18,16 +18,24 @@ void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
>         struct pt_regs dummyregs;
>         struct unwind_state state;
>
> -       regs = &dummyregs;
We can move the 'if (regs)' logic here and simplify the whole function.
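
Something like this (sketch):

	if (regs)
		memcpy(&dummyregs, regs, sizeof(*regs));
	else if (task == current) {
		dummyregs.regs[3] = (unsigned long)__builtin_frame_address(0);
		dummyregs.csr_era = (unsigned long)__builtin_return_address(0);
	} else {
		dummyregs.regs[3] = thread_saved_fp(task);
		dummyregs.csr_era = thread_saved_ra(task);
	}
	regs = &dummyregs;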

Huacai
> -
>         if (task == current) {
> -               regs->regs[3] = (unsigned long)__builtin_frame_address(0);
> -               regs->csr_era = (unsigned long)__builtin_return_address(0);
> +               if (regs)
> +                       memcpy(&dummyregs, regs, sizeof(*regs));
> +               else {
> +                       dummyregs.regs[3] = (unsigned long)__builtin_frame_address(0);
> +                       dummyregs.csr_era = (unsigned long)__builtin_return_address(0);
> +               }
>         } else {
> -               regs->regs[3] = thread_saved_fp(task);
> -               regs->csr_era = thread_saved_ra(task);
> +               if (regs)
> +                       memcpy(&dummyregs, regs, sizeof(*regs));
> +               else {
> +                       dummyregs.regs[3] = thread_saved_fp(task);
> +                       dummyregs.csr_era = thread_saved_ra(task);
> +               }
>         }
>
> +       regs = &dummyregs;
> +
>         regs->regs[1] = 0;
>         for (unwind_start(&state, task, regs);
>              !unwind_done(&state) && !unwind_error(&state); unwind_next_frame(&state)) {
> --
> 2.34.1
>
>



* Re: [PATCH 3/4 v2] KFENCE: Defer the assignment of the local variable addr
  2023-07-25  6:14 ` [PATCH 3/4 v2] KFENCE: Defer the assignment of the local variable addr Enze Li
@ 2023-07-25  7:45   ` Huacai Chen
  0 siblings, 0 replies; 16+ messages in thread
From: Huacai Chen @ 2023-07-25  7:45 UTC (permalink / raw)
  To: Enze Li
  Cc: kernel, loongarch, glider, elver, akpm, kasan-dev, linux-mm,
	zhangqing, yangtiezhu, dvyukov

Hi, Enze,

This patch is a preparation for LoongArch KFENCE support, so it is
better to move to the first patch.

Huacai

On Tue, Jul 25, 2023 at 2:15 PM Enze Li <lienze@kylinos.cn> wrote:
>
> Unlike other architectures, LoongArch needs to update __kfence_pool
> during arch_kfence_init_pool().
>
> This patch defers the assignment of the local variable addr in
> kfence_init_pool() to support the case where __kfence_pool is updated
> in arch_kfence_init_pool().
>
> Signed-off-by: Enze Li <lienze@kylinos.cn>
> Acked-by: Marco Elver <elver@google.com>
> ---
>  mm/kfence/core.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/mm/kfence/core.c b/mm/kfence/core.c
> index dad3c0eb70a0..e124ffff489f 100644
> --- a/mm/kfence/core.c
> +++ b/mm/kfence/core.c
> @@ -566,13 +566,14 @@ static void rcu_guarded_free(struct rcu_head *h)
>   */
>  static unsigned long kfence_init_pool(void)
>  {
> -       unsigned long addr = (unsigned long)__kfence_pool;
> +       unsigned long addr;
>         struct page *pages;
>         int i;
>
>         if (!arch_kfence_init_pool())
> -               return addr;
> +               return (unsigned long)__kfence_pool;
>
> +       addr = (unsigned long)__kfence_pool;
>         pages = virt_to_page(__kfence_pool);
>
>         /*
> --
> 2.34.1
>
>



* Re: [PATCH 4/4 v2] LoongArch: Add KFENCE support
  2023-07-25  6:14 ` [PATCH 4/4 v2] LoongArch: Add KFENCE support Enze Li
@ 2023-07-25  7:48   ` Huacai Chen
  2023-07-25 14:34   ` Jackie Liu
  2023-07-27  1:26   ` Huacai Chen
  2 siblings, 0 replies; 16+ messages in thread
From: Huacai Chen @ 2023-07-25  7:48 UTC (permalink / raw)
  To: Enze Li
  Cc: kernel, loongarch, glider, elver, akpm, kasan-dev, linux-mm,
	zhangqing, yangtiezhu, dvyukov

Hi, Enze,

On Tue, Jul 25, 2023 at 2:15 PM Enze Li <lienze@kylinos.cn> wrote:
>
> The LoongArch architecture is quite different from other architectures.
> When the KFENCE pool is allocated, it is mapped through the direct
> mapping configuration window [1] by default on LoongArch.  This means
> that the page table mapped mode required by KFENCE cannot be used, so
> the pool has to be remapped to an appropriate region.
>
> This patch adds architecture specific implementation details for KFENCE.
> In particular, this implements the required interface in <asm/kfence.h>.
>
> Tested this patch by running the testcases; all of them passed.
>
> [1] https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#virtual-address-space-and-address-translation-mode
>
> Signed-off-by: Enze Li <lienze@kylinos.cn>
> ---
>  arch/loongarch/Kconfig               |  1 +
>  arch/loongarch/include/asm/kfence.h  | 62 ++++++++++++++++++++++++++++
>  arch/loongarch/include/asm/pgtable.h | 14 ++++++-
>  arch/loongarch/mm/fault.c            | 22 ++++++----
>  4 files changed, 90 insertions(+), 9 deletions(-)
>  create mode 100644 arch/loongarch/include/asm/kfence.h
>
> diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> index 70635ea3d1e4..5b63b16be49e 100644
> --- a/arch/loongarch/Kconfig
> +++ b/arch/loongarch/Kconfig
> @@ -91,6 +91,7 @@ config LOONGARCH
>         select HAVE_ARCH_AUDITSYSCALL
>         select HAVE_ARCH_JUMP_LABEL
>         select HAVE_ARCH_JUMP_LABEL_RELATIVE
> +       select HAVE_ARCH_KFENCE
>         select HAVE_ARCH_MMAP_RND_BITS if MMU
>         select HAVE_ARCH_SECCOMP_FILTER
>         select HAVE_ARCH_TRACEHOOK
> diff --git a/arch/loongarch/include/asm/kfence.h b/arch/loongarch/include/asm/kfence.h
> new file mode 100644
> index 000000000000..fb39076fe4d7
> --- /dev/null
> +++ b/arch/loongarch/include/asm/kfence.h
> @@ -0,0 +1,62 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * KFENCE support for LoongArch.
> + *
> + * Author: Enze Li <lienze@kylinos.cn>
> + * Copyright (C) 2022-2023 KylinSoft Corporation.
> + */
> +
> +#ifndef _ASM_LOONGARCH_KFENCE_H
> +#define _ASM_LOONGARCH_KFENCE_H
> +
> +#include <linux/kfence.h>
> +#include <asm/pgtable.h>
> +#include <asm/tlb.h>
> +
> +static inline bool arch_kfence_init_pool(void)
> +{
> +       char *kfence_pool = __kfence_pool;
> +       struct vm_struct *area;
> +       int err;
> +
> +       area = __get_vm_area_caller(KFENCE_POOL_SIZE, VM_IOREMAP,
> +                                   KFENCE_AREA_START, KFENCE_AREA_END,
> +                                   __builtin_return_address(0));
> +       if (!area)
> +               return false;
> +
> +       __kfence_pool = (char *)area->addr;
> +       err = ioremap_page_range((unsigned long)__kfence_pool,
> +                                (unsigned long)__kfence_pool + KFENCE_POOL_SIZE,
> +                                virt_to_phys((void *)kfence_pool),
> +                                PAGE_KERNEL);
> +       if (err) {
> +               free_vm_area(area);
> +               return false;
> +       }
> +
> +       return true;
> +}
> +
> +/* Protect the given page and flush TLB. */
> +static inline bool kfence_protect_page(unsigned long addr, bool protect)
> +{
> +       pte_t *pte = virt_to_kpte(addr);
> +
> +       if (WARN_ON(!pte) || pte_none(*pte))
> +               return false;
> +
> +       if (protect)
> +               set_pte(pte, __pte(pte_val(*pte) & ~(_PAGE_VALID | _PAGE_PRESENT)));
> +       else
> +               set_pte(pte, __pte(pte_val(*pte) | (_PAGE_VALID | _PAGE_PRESENT)));
> +
> +       /* Flush this CPU's TLB. */
This comment can be removed since the logic is obvious.

> +       preempt_disable();
> +       local_flush_tlb_one(addr);
> +       preempt_enable();
> +
> +       return true;
> +}
> +
> +#endif /* _ASM_LOONGARCH_KFENCE_H */
> diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h
> index 98a0c98de9d1..2702a6ba7122 100644
> --- a/arch/loongarch/include/asm/pgtable.h
> +++ b/arch/loongarch/include/asm/pgtable.h
> @@ -77,6 +77,13 @@ extern unsigned long zero_page_mask;
>         (virt_to_page((void *)(empty_zero_page + (((unsigned long)(vaddr)) & zero_page_mask))))
>  #define __HAVE_COLOR_ZERO_PAGE
>
> +#ifdef CONFIG_KFENCE
> +#define KFENCE_AREA_SIZE \
> +       (((CONFIG_KFENCE_NUM_OBJECTS + 1) * 2 + 2) * PAGE_SIZE)
No need to break this onto a new line.
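
i.e. keep it on one line:

	#define KFENCE_AREA_SIZE	(((CONFIG_KFENCE_NUM_OBJECTS + 1) * 2 + 2) * PAGE_SIZE)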

Others look good to me.

Huacai

> +#else
> +#define KFENCE_AREA_SIZE       0
> +#endif
> +
>  /*
>   * TLB refill handlers may also map the vmalloc area into xkvrange.
>   * Avoid the first couple of pages so NULL pointer dereferences will
> @@ -88,11 +95,16 @@ extern unsigned long zero_page_mask;
>  #define VMALLOC_START  MODULES_END
>  #define VMALLOC_END    \
>         (vm_map_base +  \
> -        min(PTRS_PER_PGD * PTRS_PER_PUD * PTRS_PER_PMD * PTRS_PER_PTE * PAGE_SIZE, (1UL << cpu_vabits)) - PMD_SIZE - VMEMMAP_SIZE)
> +        min(PTRS_PER_PGD * PTRS_PER_PUD * PTRS_PER_PMD * PTRS_PER_PTE * PAGE_SIZE, (1UL << cpu_vabits)) - PMD_SIZE - VMEMMAP_SIZE - KFENCE_AREA_SIZE)
>
>  #define vmemmap                ((struct page *)((VMALLOC_END + PMD_SIZE) & PMD_MASK))
>  #define VMEMMAP_END    ((unsigned long)vmemmap + VMEMMAP_SIZE - 1)
>
> +#ifdef CONFIG_KFENCE
> +#define KFENCE_AREA_START      VMEMMAP_END
> +#define KFENCE_AREA_END                (KFENCE_AREA_START + KFENCE_AREA_SIZE)
> +#endif
> +
>  #define pte_ERROR(e) \
>         pr_err("%s:%d: bad pte %016lx.\n", __FILE__, __LINE__, pte_val(e))
>  #ifndef __PAGETABLE_PMD_FOLDED
> diff --git a/arch/loongarch/mm/fault.c b/arch/loongarch/mm/fault.c
> index da5b6d518cdb..c0319128b221 100644
> --- a/arch/loongarch/mm/fault.c
> +++ b/arch/loongarch/mm/fault.c
> @@ -23,6 +23,7 @@
>  #include <linux/kprobes.h>
>  #include <linux/perf_event.h>
>  #include <linux/uaccess.h>
> +#include <linux/kfence.h>
>
>  #include <asm/branch.h>
>  #include <asm/mmu_context.h>
> @@ -30,7 +31,8 @@
>
>  int show_unhandled_signals = 1;
>
> -static void __kprobes no_context(struct pt_regs *regs, unsigned long address)
> +static void __kprobes no_context(struct pt_regs *regs, unsigned long address,
> +                                unsigned long write)
>  {
>         const int field = sizeof(unsigned long) * 2;
>
> @@ -38,6 +40,9 @@ static void __kprobes no_context(struct pt_regs *regs, unsigned long address)
>         if (fixup_exception(regs))
>                 return;
>
> +       if (kfence_handle_page_fault(address, write, regs))
> +               return;
> +
>         /*
>          * Oops. The kernel tried to access some bad page. We'll have to
>          * terminate things with extreme prejudice.
> @@ -51,14 +56,15 @@ static void __kprobes no_context(struct pt_regs *regs, unsigned long address)
>         die("Oops", regs);
>  }
>
> -static void __kprobes do_out_of_memory(struct pt_regs *regs, unsigned long address)
> +static void __kprobes do_out_of_memory(struct pt_regs *regs, unsigned long address,
> +                                      unsigned long write)
>  {
>         /*
>          * We ran out of memory, call the OOM killer, and return the userspace
>          * (which will retry the fault, or kill us if we got oom-killed).
>          */
>         if (!user_mode(regs)) {
> -               no_context(regs, address);
> +               no_context(regs, address, write);
>                 return;
>         }
>         pagefault_out_of_memory();
> @@ -69,7 +75,7 @@ static void __kprobes do_sigbus(struct pt_regs *regs,
>  {
>         /* Kernel mode? Handle exceptions or die */
>         if (!user_mode(regs)) {
> -               no_context(regs, address);
> +               no_context(regs, address, write);
>                 return;
>         }
>
> @@ -90,7 +96,7 @@ static void __kprobes do_sigsegv(struct pt_regs *regs,
>
>         /* Kernel mode? Handle exceptions or die */
>         if (!user_mode(regs)) {
> -               no_context(regs, address);
> +               no_context(regs, address, write);
>                 return;
>         }
>
> @@ -149,7 +155,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs,
>          */
>         if (address & __UA_LIMIT) {
>                 if (!user_mode(regs))
> -                       no_context(regs, address);
> +                       no_context(regs, address, write);
>                 else
>                         do_sigsegv(regs, write, address, si_code);
>                 return;
> @@ -211,7 +217,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs,
>
>         if (fault_signal_pending(fault, regs)) {
>                 if (!user_mode(regs))
> -                       no_context(regs, address);
> +                       no_context(regs, address, write);
>                 return;
>         }
>
> @@ -232,7 +238,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs,
>         if (unlikely(fault & VM_FAULT_ERROR)) {
>                 mmap_read_unlock(mm);
>                 if (fault & VM_FAULT_OOM) {
> -                       do_out_of_memory(regs, address);
> +                       do_out_of_memory(regs, address, write);
>                         return;
>                 } else if (fault & VM_FAULT_SIGSEGV) {
>                         do_sigsegv(regs, write, address, si_code);
> --
> 2.34.1
>
>



* Re: [PATCH 4/4 v2] LoongArch: Add KFENCE support
  2023-07-25  6:14 ` [PATCH 4/4 v2] LoongArch: Add KFENCE support Enze Li
  2023-07-25  7:48   ` Huacai Chen
@ 2023-07-25 14:34   ` Jackie Liu
  2023-07-28  6:01     ` Enze Li
  2023-07-27  1:26   ` Huacai Chen
  2 siblings, 1 reply; 16+ messages in thread
From: Jackie Liu @ 2023-07-25 14:34 UTC (permalink / raw)
  To: Enze Li, chenhuacai, kernel, loongarch, glider, elver, akpm,
	kasan-dev, linux-mm
  Cc: zhangqing, yangtiezhu, dvyukov



On 2023/7/25 14:14, Enze Li wrote:
> The LoongArch architecture is quite different from other architectures.
> When the KFENCE pool is allocated, it is mapped through the direct
> mapping configuration window [1] by default on LoongArch.  This means
> that the page table mapped mode required by KFENCE cannot be used, so
> the pool has to be remapped to an appropriate region.
>
> This patch adds architecture specific implementation details for KFENCE.
> In particular, this implements the required interface in <asm/kfence.h>.
>
> Tested this patch by running the testcases; all of them passed.
>
> [1] https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#virtual-address-space-and-address-translation-mode
>
> Signed-off-by: Enze Li <lienze@kylinos.cn>
> ---
>   arch/loongarch/Kconfig               |  1 +
>   arch/loongarch/include/asm/kfence.h  | 62 ++++++++++++++++++++++++++++
>   arch/loongarch/include/asm/pgtable.h | 14 ++++++-
>   arch/loongarch/mm/fault.c            | 22 ++++++----
>   4 files changed, 90 insertions(+), 9 deletions(-)
>   create mode 100644 arch/loongarch/include/asm/kfence.h
>
> diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> index 70635ea3d1e4..5b63b16be49e 100644
> --- a/arch/loongarch/Kconfig
> +++ b/arch/loongarch/Kconfig
> @@ -91,6 +91,7 @@ config LOONGARCH
>   	select HAVE_ARCH_AUDITSYSCALL
>   	select HAVE_ARCH_JUMP_LABEL
>   	select HAVE_ARCH_JUMP_LABEL_RELATIVE
> +	select HAVE_ARCH_KFENCE
>   	select HAVE_ARCH_MMAP_RND_BITS if MMU
>   	select HAVE_ARCH_SECCOMP_FILTER
>   	select HAVE_ARCH_TRACEHOOK
> diff --git a/arch/loongarch/include/asm/kfence.h b/arch/loongarch/include/asm/kfence.h
> new file mode 100644
> index 000000000000..fb39076fe4d7
> --- /dev/null
> +++ b/arch/loongarch/include/asm/kfence.h
> @@ -0,0 +1,62 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * KFENCE support for LoongArch.
> + *
> + * Author: Enze Li <lienze@kylinos.cn>
> + * Copyright (C) 2022-2023 KylinSoft Corporation.
> + */
> +
> +#ifndef _ASM_LOONGARCH_KFENCE_H
> +#define _ASM_LOONGARCH_KFENCE_H
> +
> +#include <linux/kfence.h>
> +#include <asm/pgtable.h>
> +#include <asm/tlb.h>
> +
> +static inline bool arch_kfence_init_pool(void)
> +{
> +	char *kfence_pool = __kfence_pool;
> +	struct vm_struct *area;
> +	int err;
> +
> +	area = __get_vm_area_caller(KFENCE_POOL_SIZE, VM_IOREMAP,
> +				    KFENCE_AREA_START, KFENCE_AREA_END,
> +				    __builtin_return_address(0));
> +	if (!area)
> +		return false;
> +
> +	__kfence_pool = (char *)area->addr;

I think there is something wrong here.

> +	err = ioremap_page_range((unsigned long)__kfence_pool,
> +				 (unsigned long)__kfence_pool + KFENCE_POOL_SIZE,
> +				 virt_to_phys((void *)kfence_pool),
> +				 PAGE_KERNEL);
> +	if (err) {
> +		free_vm_area(area);

If err is nonzero, __kfence_pool is left pointing at the freed
area->addr when we return here, which is not correct.
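
A minimal fix might be to restore the old pointer before bailing out
(sketch, untested):

	if (err) {
		__kfence_pool = kfence_pool;	/* restore the original pool */
		free_vm_area(area);
		return false;
	}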

-- 
Jackie Liu

> +		return false;
> +	}
> +
> +	return true;
> +}
> +
> +/* Protect the given page and flush TLB. */
> +static inline bool kfence_protect_page(unsigned long addr, bool protect)
> +{
> +	pte_t *pte = virt_to_kpte(addr);
> +
> +	if (WARN_ON(!pte) || pte_none(*pte))
> +		return false;
> +
> +	if (protect)
> +		set_pte(pte, __pte(pte_val(*pte) & ~(_PAGE_VALID | _PAGE_PRESENT)));
> +	else
> +		set_pte(pte, __pte(pte_val(*pte) | (_PAGE_VALID | _PAGE_PRESENT)));
> +
> +	/* Flush this CPU's TLB. */
> +	preempt_disable();
> +	local_flush_tlb_one(addr);
> +	preempt_enable();
> +
> +	return true;
> +}
> +
> +#endif /* _ASM_LOONGARCH_KFENCE_H */
> diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h
> index 98a0c98de9d1..2702a6ba7122 100644
> --- a/arch/loongarch/include/asm/pgtable.h
> +++ b/arch/loongarch/include/asm/pgtable.h
> @@ -77,6 +77,13 @@ extern unsigned long zero_page_mask;
>   	(virt_to_page((void *)(empty_zero_page + (((unsigned long)(vaddr)) & zero_page_mask))))
>   #define __HAVE_COLOR_ZERO_PAGE
>   
> +#ifdef CONFIG_KFENCE
> +#define KFENCE_AREA_SIZE \
> +	(((CONFIG_KFENCE_NUM_OBJECTS + 1) * 2 + 2) * PAGE_SIZE)
> +#else
> +#define KFENCE_AREA_SIZE	0
> +#endif
> +
>   /*
>    * TLB refill handlers may also map the vmalloc area into xkvrange.
>    * Avoid the first couple of pages so NULL pointer dereferences will
> @@ -88,11 +95,16 @@ extern unsigned long zero_page_mask;
>   #define VMALLOC_START	MODULES_END
>   #define VMALLOC_END	\
>   	(vm_map_base +	\
> -	 min(PTRS_PER_PGD * PTRS_PER_PUD * PTRS_PER_PMD * PTRS_PER_PTE * PAGE_SIZE, (1UL << cpu_vabits)) - PMD_SIZE - VMEMMAP_SIZE)
> +	 min(PTRS_PER_PGD * PTRS_PER_PUD * PTRS_PER_PMD * PTRS_PER_PTE * PAGE_SIZE, (1UL << cpu_vabits)) - PMD_SIZE - VMEMMAP_SIZE - KFENCE_AREA_SIZE)
>   
>   #define vmemmap		((struct page *)((VMALLOC_END + PMD_SIZE) & PMD_MASK))
>   #define VMEMMAP_END	((unsigned long)vmemmap + VMEMMAP_SIZE - 1)
>   
> +#ifdef CONFIG_KFENCE
> +#define KFENCE_AREA_START	VMEMMAP_END
> +#define KFENCE_AREA_END		(KFENCE_AREA_START + KFENCE_AREA_SIZE)
> +#endif
> +
>   #define pte_ERROR(e) \
>   	pr_err("%s:%d: bad pte %016lx.\n", __FILE__, __LINE__, pte_val(e))
>   #ifndef __PAGETABLE_PMD_FOLDED
> diff --git a/arch/loongarch/mm/fault.c b/arch/loongarch/mm/fault.c
> index da5b6d518cdb..c0319128b221 100644
> --- a/arch/loongarch/mm/fault.c
> +++ b/arch/loongarch/mm/fault.c
> @@ -23,6 +23,7 @@
>   #include <linux/kprobes.h>
>   #include <linux/perf_event.h>
>   #include <linux/uaccess.h>
> +#include <linux/kfence.h>
>   
>   #include <asm/branch.h>
>   #include <asm/mmu_context.h>
> @@ -30,7 +31,8 @@
>   
>   int show_unhandled_signals = 1;
>   
> -static void __kprobes no_context(struct pt_regs *regs, unsigned long address)
> +static void __kprobes no_context(struct pt_regs *regs, unsigned long address,
> +				 unsigned long write)
>   {
>   	const int field = sizeof(unsigned long) * 2;
>   
> @@ -38,6 +40,9 @@ static void __kprobes no_context(struct pt_regs *regs, unsigned long address)
>   	if (fixup_exception(regs))
>   		return;
>   
> +	if (kfence_handle_page_fault(address, write, regs))
> +		return;
> +
>   	/*
>   	 * Oops. The kernel tried to access some bad page. We'll have to
>   	 * terminate things with extreme prejudice.
> @@ -51,14 +56,15 @@ static void __kprobes no_context(struct pt_regs *regs, unsigned long address)
>   	die("Oops", regs);
>   }
>   
> -static void __kprobes do_out_of_memory(struct pt_regs *regs, unsigned long address)
> +static void __kprobes do_out_of_memory(struct pt_regs *regs, unsigned long address,
> +				       unsigned long write)
>   {
>   	/*
>   	 * We ran out of memory, call the OOM killer, and return the userspace
>   	 * (which will retry the fault, or kill us if we got oom-killed).
>   	 */
>   	if (!user_mode(regs)) {
> -		no_context(regs, address);
> +		no_context(regs, address, write);
>   		return;
>   	}
>   	pagefault_out_of_memory();
> @@ -69,7 +75,7 @@ static void __kprobes do_sigbus(struct pt_regs *regs,
>   {
>   	/* Kernel mode? Handle exceptions or die */
>   	if (!user_mode(regs)) {
> -		no_context(regs, address);
> +		no_context(regs, address, write);
>   		return;
>   	}
>   
> @@ -90,7 +96,7 @@ static void __kprobes do_sigsegv(struct pt_regs *regs,
>   
>   	/* Kernel mode? Handle exceptions or die */
>   	if (!user_mode(regs)) {
> -		no_context(regs, address);
> +		no_context(regs, address, write);
>   		return;
>   	}
>   
> @@ -149,7 +155,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs,
>   	 */
>   	if (address & __UA_LIMIT) {
>   		if (!user_mode(regs))
> -			no_context(regs, address);
> +			no_context(regs, address, write);
>   		else
>   			do_sigsegv(regs, write, address, si_code);
>   		return;
> @@ -211,7 +217,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs,
>   
>   	if (fault_signal_pending(fault, regs)) {
>   		if (!user_mode(regs))
> -			no_context(regs, address);
> +			no_context(regs, address, write);
>   		return;
>   	}
>   
> @@ -232,7 +238,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs,
>   	if (unlikely(fault & VM_FAULT_ERROR)) {
>   		mmap_read_unlock(mm);
>   		if (fault & VM_FAULT_OOM) {
> -			do_out_of_memory(regs, address);
> +			do_out_of_memory(regs, address, write);
>   			return;
>   		} else if (fault & VM_FAULT_SIGSEGV) {
>   			do_sigsegv(regs, write, address, si_code);




* Re: [PATCH 2/4 v2] LoongArch: Get stack without NMI when providing regs parameter
  2023-07-25  6:14 ` [PATCH 2/4 v2] LoongArch: Get stack without NMI when providing regs parameter Enze Li
  2023-07-25  7:40   ` Huacai Chen
@ 2023-07-26  2:59   ` Jinyang He
  2023-07-28  6:57     ` Enze Li
  1 sibling, 1 reply; 16+ messages in thread
From: Jinyang He @ 2023-07-26  2:59 UTC (permalink / raw)
  To: Enze Li, chenhuacai, kernel, loongarch, glider, elver, akpm,
	kasan-dev, linux-mm
  Cc: yangtiezhu, dvyukov

On 2023-07-25 14:14, Enze Li wrote:

> Currently, arch_stack_walk() can only get the full stack information
> including NMI.  This is because the implementation of arch_stack_walk()
> is forced to ignore the information passed by the regs parameter and use
> the current stack information instead.
>
> For some detection systems like KFENCE, only partial stack information
> is needed, in particular the stack frame where the interrupt occurred.
>
> To support KFENCE, this patch modifies the implementation of the
> arch_stack_walk() function so that if this function is called with the
> regs argument passed, it retains all the stack information in regs and
> uses it to provide accurate information.
>
> Before this patch is applied, I get:
> [    1.531195 ] ==================================================================
> [    1.531442 ] BUG: KFENCE: out-of-bounds read in stack_trace_save_regs+0x48/0x6c
> [    1.531442 ]
> [    1.531900 ] Out-of-bounds read at 0xffff800012267fff (1B left of kfence-#12):
> [    1.532046 ]  stack_trace_save_regs+0x48/0x6c
> [    1.532169 ]  kfence_report_error+0xa4/0x528
> [    1.532276 ]  kfence_handle_page_fault+0x124/0x270
> [    1.532388 ]  no_context+0x50/0x94
> [    1.532453 ]  do_page_fault+0x1a8/0x36c
> [    1.532524 ]  tlb_do_page_fault_0+0x118/0x1b4
> [    1.532623 ]  test_out_of_bounds_read+0xa0/0x1d8
> [    1.532745 ]  kunit_generic_run_threadfn_adapter+0x1c/0x28
> [    1.532854 ]  kthread+0x124/0x130
> [    1.532922 ]  ret_from_kernel_thread+0xc/0xa4
> <snip>
>
> With this patch applied, I get the correct stack information.
> [    1.320220 ] ==================================================================
> [    1.320401 ] BUG: KFENCE: out-of-bounds read in test_out_of_bounds_read+0xa8/0x1d8
> [    1.320401 ]
> [    1.320898 ] Out-of-bounds read at 0xffff800012257fff (1B left of kfence-#10):
> [    1.321134 ]  test_out_of_bounds_read+0xa8/0x1d8
> [    1.321264 ]  kunit_generic_run_threadfn_adapter+0x1c/0x28
> [    1.321392 ]  kthread+0x124/0x130
> [    1.321459 ]  ret_from_kernel_thread+0xc/0xa4
> <snip>
>
> Signed-off-by: Enze Li <lienze@kylinos.cn>
> ---
>   arch/loongarch/kernel/stacktrace.c | 20 ++++++++++++++------
>   1 file changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/arch/loongarch/kernel/stacktrace.c b/arch/loongarch/kernel/stacktrace.c
> index 2463d2fea21f..9dab30ae68ec 100644
> --- a/arch/loongarch/kernel/stacktrace.c
> +++ b/arch/loongarch/kernel/stacktrace.c
> @@ -18,16 +18,24 @@ void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
>   	struct pt_regs dummyregs;
>   	struct unwind_state state;
>   
> -	regs = &dummyregs;
> -
>   	if (task == current) {
> -		regs->regs[3] = (unsigned long)__builtin_frame_address(0);
> -		regs->csr_era = (unsigned long)__builtin_return_address(0);
> +		if (regs)
> +			memcpy(&dummyregs, regs, sizeof(*regs));
> +		else {
> +			dummyregs.regs[3] = (unsigned long)__builtin_frame_address(0);
> +			dummyregs.csr_era = (unsigned long)__builtin_return_address(0);
> +		}
>   	} else {
> -		regs->regs[3] = thread_saved_fp(task);
> -		regs->csr_era = thread_saved_ra(task);
> +		if (regs)
> +			memcpy(&dummyregs, regs, sizeof(*regs));
> +		else {
> +			dummyregs.regs[3] = thread_saved_fp(task);
> +			dummyregs.csr_era = thread_saved_ra(task);
> +		}
>   	}
>   
> +	regs = &dummyregs;
> +

if (!regs) {
     regs = &dummyregs;

     if (task == current) {
         regs->regs[3] = (unsigned long)__builtin_frame_address(0);
         regs->csr_era = (unsigned long)__builtin_return_address(0);
     } else {
         regs->regs[3] = thread_saved_fp(task);
         regs->csr_era = thread_saved_ra(task);
     }
     regs->regs[1] = 0;
}

BTW, I remember that __unwind_start() already deals with this issue for
regs, task and current.  arch_stack_walk() would not need to provide
current or task regs if we fixed unwind_start() to skip its parent
frame (the caller, arch_stack_walk()).  But the current approach is
better, I think.


Thanks,

Jinyang

>   	regs->regs[1] = 0;
>   	for (unwind_start(&state, task, regs);
>   	     !unwind_done(&state) && !unwind_error(&state); unwind_next_frame(&state)) {




* Re: [PATCH 4/4 v2] LoongArch: Add KFENCE support
  2023-07-25  6:14 ` [PATCH 4/4 v2] LoongArch: Add KFENCE support Enze Li
  2023-07-25  7:48   ` Huacai Chen
  2023-07-25 14:34   ` Jackie Liu
@ 2023-07-27  1:26   ` Huacai Chen
  2023-07-28  3:27     ` Enze Li
  2 siblings, 1 reply; 16+ messages in thread
From: Huacai Chen @ 2023-07-27  1:26 UTC (permalink / raw)
  To: Enze Li
  Cc: kernel, loongarch, glider, elver, akpm, kasan-dev, linux-mm,
	zhangqing, yangtiezhu, dvyukov

On Tue, Jul 25, 2023 at 2:15 PM Enze Li <lienze@kylinos.cn> wrote:
>
> The LoongArch architecture is quite different from other architectures.
> When the KFENCE pool is allocated, it is mapped through the direct
> mapping configuration window [1] by default on LoongArch.  This means
> that the page table mapped mode required by KFENCE cannot be used, so
> the pool has to be remapped to an appropriate region.
>
> This patch adds architecture specific implementation details for KFENCE.
> In particular, this implements the required interface in <asm/kfence.h>.
>
> Tested this patch by running the testcases; all of them passed.
>
> [1] https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#virtual-address-space-and-address-translation-mode
>
> Signed-off-by: Enze Li <lienze@kylinos.cn>
> ---
>  arch/loongarch/Kconfig               |  1 +
>  arch/loongarch/include/asm/kfence.h  | 62 ++++++++++++++++++++++++++++
>  arch/loongarch/include/asm/pgtable.h | 14 ++++++-
>  arch/loongarch/mm/fault.c            | 22 ++++++----
>  4 files changed, 90 insertions(+), 9 deletions(-)
>  create mode 100644 arch/loongarch/include/asm/kfence.h
>
> diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> index 70635ea3d1e4..5b63b16be49e 100644
> --- a/arch/loongarch/Kconfig
> +++ b/arch/loongarch/Kconfig
> @@ -91,6 +91,7 @@ config LOONGARCH
>         select HAVE_ARCH_AUDITSYSCALL
>         select HAVE_ARCH_JUMP_LABEL
>         select HAVE_ARCH_JUMP_LABEL_RELATIVE
> +       select HAVE_ARCH_KFENCE
>         select HAVE_ARCH_MMAP_RND_BITS if MMU
>         select HAVE_ARCH_SECCOMP_FILTER
>         select HAVE_ARCH_TRACEHOOK
> diff --git a/arch/loongarch/include/asm/kfence.h b/arch/loongarch/include/asm/kfence.h
> new file mode 100644
> index 000000000000..fb39076fe4d7
> --- /dev/null
> +++ b/arch/loongarch/include/asm/kfence.h
> @@ -0,0 +1,62 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * KFENCE support for LoongArch.
> + *
> + * Author: Enze Li <lienze@kylinos.cn>
> + * Copyright (C) 2022-2023 KylinSoft Corporation.
> + */
> +
> +#ifndef _ASM_LOONGARCH_KFENCE_H
> +#define _ASM_LOONGARCH_KFENCE_H
> +
> +#include <linux/kfence.h>
> +#include <asm/pgtable.h>
> +#include <asm/tlb.h>
> +
> +static inline bool arch_kfence_init_pool(void)
> +{
> +       char *kfence_pool = __kfence_pool;
> +       struct vm_struct *area;
> +       int err;
> +
> +       area = __get_vm_area_caller(KFENCE_POOL_SIZE, VM_IOREMAP,
> +                                   KFENCE_AREA_START, KFENCE_AREA_END,
> +                                   __builtin_return_address(0));
> +       if (!area)
> +               return false;
> +
> +       __kfence_pool = (char *)area->addr;
> +       err = ioremap_page_range((unsigned long)__kfence_pool,
> +                                (unsigned long)__kfence_pool + KFENCE_POOL_SIZE,
> +                                virt_to_phys((void *)kfence_pool),
> +                                PAGE_KERNEL);
> +       if (err) {
> +               free_vm_area(area);
> +               return false;
> +       }
> +
> +       return true;
> +}
> +
> +/* Protect the given page and flush TLB. */
> +static inline bool kfence_protect_page(unsigned long addr, bool protect)
> +{
> +       pte_t *pte = virt_to_kpte(addr);
> +
> +       if (WARN_ON(!pte) || pte_none(*pte))
> +               return false;
> +
> +       if (protect)
> +               set_pte(pte, __pte(pte_val(*pte) & ~(_PAGE_VALID | _PAGE_PRESENT)));
> +       else
> +               set_pte(pte, __pte(pte_val(*pte) | (_PAGE_VALID | _PAGE_PRESENT)));
> +
> +       /* Flush this CPU's TLB. */
> +       preempt_disable();
> +       local_flush_tlb_one(addr);
> +       preempt_enable();
> +
> +       return true;
> +}
> +
> +#endif /* _ASM_LOONGARCH_KFENCE_H */
> diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h
> index 98a0c98de9d1..2702a6ba7122 100644
> --- a/arch/loongarch/include/asm/pgtable.h
> +++ b/arch/loongarch/include/asm/pgtable.h
> @@ -77,6 +77,13 @@ extern unsigned long zero_page_mask;
>         (virt_to_page((void *)(empty_zero_page + (((unsigned long)(vaddr)) & zero_page_mask))))
>  #define __HAVE_COLOR_ZERO_PAGE
>
> +#ifdef CONFIG_KFENCE
> +#define KFENCE_AREA_SIZE \
> +       (((CONFIG_KFENCE_NUM_OBJECTS + 1) * 2 + 2) * PAGE_SIZE)
Another question: Why define KFENCE_AREA_SIZE while there is already
KFENCE_POOL_SIZE? And why is KFENCE_AREA_SIZE a little larger than
KFENCE_POOL_SIZE? If we can reuse KFENCE_POOL_SIZE,
KFENCE_AREA_START/KFENCE_AREA_END can be renamed to
KFENCE_POOL_START/KFENCE_POOL_END.

Huacai

> +#else
> +#define KFENCE_AREA_SIZE       0
> +#endif
> +
>  /*
>   * TLB refill handlers may also map the vmalloc area into xkvrange.
>   * Avoid the first couple of pages so NULL pointer dereferences will
> @@ -88,11 +95,16 @@ extern unsigned long zero_page_mask;
>  #define VMALLOC_START  MODULES_END
>  #define VMALLOC_END    \
>         (vm_map_base +  \
> -        min(PTRS_PER_PGD * PTRS_PER_PUD * PTRS_PER_PMD * PTRS_PER_PTE * PAGE_SIZE, (1UL << cpu_vabits)) - PMD_SIZE - VMEMMAP_SIZE)
> +        min(PTRS_PER_PGD * PTRS_PER_PUD * PTRS_PER_PMD * PTRS_PER_PTE * PAGE_SIZE, (1UL << cpu_vabits)) - PMD_SIZE - VMEMMAP_SIZE - KFENCE_AREA_SIZE)
>
>  #define vmemmap                ((struct page *)((VMALLOC_END + PMD_SIZE) & PMD_MASK))
>  #define VMEMMAP_END    ((unsigned long)vmemmap + VMEMMAP_SIZE - 1)
>
> +#ifdef CONFIG_KFENCE
> +#define KFENCE_AREA_START      VMEMMAP_END
> +#define KFENCE_AREA_END                (KFENCE_AREA_START + KFENCE_AREA_SIZE)
> +#endif
> +
>  #define pte_ERROR(e) \
>         pr_err("%s:%d: bad pte %016lx.\n", __FILE__, __LINE__, pte_val(e))
>  #ifndef __PAGETABLE_PMD_FOLDED
> diff --git a/arch/loongarch/mm/fault.c b/arch/loongarch/mm/fault.c
> index da5b6d518cdb..c0319128b221 100644
> --- a/arch/loongarch/mm/fault.c
> +++ b/arch/loongarch/mm/fault.c
> @@ -23,6 +23,7 @@
>  #include <linux/kprobes.h>
>  #include <linux/perf_event.h>
>  #include <linux/uaccess.h>
> +#include <linux/kfence.h>
>
>  #include <asm/branch.h>
>  #include <asm/mmu_context.h>
> @@ -30,7 +31,8 @@
>
>  int show_unhandled_signals = 1;
>
> -static void __kprobes no_context(struct pt_regs *regs, unsigned long address)
> +static void __kprobes no_context(struct pt_regs *regs, unsigned long address,
> +                                unsigned long write)
>  {
>         const int field = sizeof(unsigned long) * 2;
>
> @@ -38,6 +40,9 @@ static void __kprobes no_context(struct pt_regs *regs, unsigned long address)
>         if (fixup_exception(regs))
>                 return;
>
> +       if (kfence_handle_page_fault(address, write, regs))
> +               return;
> +
>         /*
>          * Oops. The kernel tried to access some bad page. We'll have to
>          * terminate things with extreme prejudice.
> @@ -51,14 +56,15 @@ static void __kprobes no_context(struct pt_regs *regs, unsigned long address)
>         die("Oops", regs);
>  }
>
> -static void __kprobes do_out_of_memory(struct pt_regs *regs, unsigned long address)
> +static void __kprobes do_out_of_memory(struct pt_regs *regs, unsigned long address,
> +                                      unsigned long write)
>  {
>         /*
>          * We ran out of memory, call the OOM killer, and return the userspace
>          * (which will retry the fault, or kill us if we got oom-killed).
>          */
>         if (!user_mode(regs)) {
> -               no_context(regs, address);
> +               no_context(regs, address, write);
>                 return;
>         }
>         pagefault_out_of_memory();
> @@ -69,7 +75,7 @@ static void __kprobes do_sigbus(struct pt_regs *regs,
>  {
>         /* Kernel mode? Handle exceptions or die */
>         if (!user_mode(regs)) {
> -               no_context(regs, address);
> +               no_context(regs, address, write);
>                 return;
>         }
>
> @@ -90,7 +96,7 @@ static void __kprobes do_sigsegv(struct pt_regs *regs,
>
>         /* Kernel mode? Handle exceptions or die */
>         if (!user_mode(regs)) {
> -               no_context(regs, address);
> +               no_context(regs, address, write);
>                 return;
>         }
>
> @@ -149,7 +155,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs,
>          */
>         if (address & __UA_LIMIT) {
>                 if (!user_mode(regs))
> -                       no_context(regs, address);
> +                       no_context(regs, address, write);
>                 else
>                         do_sigsegv(regs, write, address, si_code);
>                 return;
> @@ -211,7 +217,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs,
>
>         if (fault_signal_pending(fault, regs)) {
>                 if (!user_mode(regs))
> -                       no_context(regs, address);
> +                       no_context(regs, address, write);
>                 return;
>         }
>
> @@ -232,7 +238,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs,
>         if (unlikely(fault & VM_FAULT_ERROR)) {
>                 mmap_read_unlock(mm);
>                 if (fault & VM_FAULT_OOM) {
> -                       do_out_of_memory(regs, address);
> +                       do_out_of_memory(regs, address, write);
>                         return;
>                 } else if (fault & VM_FAULT_SIGSEGV) {
>                         do_sigsegv(regs, write, address, si_code);
> --
> 2.34.1
>
>


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4 v2] LoongArch: Add KFENCE support
  2023-07-27  1:26   ` Huacai Chen
@ 2023-07-28  3:27     ` Enze Li
  2023-07-28  4:33       ` Huacai Chen
  0 siblings, 1 reply; 16+ messages in thread
From: Enze Li @ 2023-07-28  3:27 UTC (permalink / raw)
  To: Huacai Chen
  Cc: kernel, loongarch, glider, elver, akpm, kasan-dev, linux-mm,
	zhangqing, yangtiezhu, dvyukov

On Thu, Jul 27 2023 at 09:26:04 AM +0800, Huacai Chen wrote:

> On Tue, Jul 25, 2023 at 2:15 PM Enze Li <lienze@kylinos.cn> wrote:
>>
>> The LoongArch architecture is quite different from other architectures.
>> When the KFENCE pool itself is allocated, it is mapped in the direct
>> mapping configuration window [1] by default on LoongArch.  This means
>> that it is not possible to use the page table mapped mode required by
>> the KFENCE system, and therefore the pool should be remapped to an
>> appropriate region.
>>
>> This patch adds architecture-specific implementation details for KFENCE.
>> In particular, this implements the required interface in <asm/kfence.h>.
>>
>> Tested this patch by running the testcases; all of them passed.
>>
>> [1] https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#virtual-address-space-and-address-translation-mode
>>
>> Signed-off-by: Enze Li <lienze@kylinos.cn>
>> ---
>>  arch/loongarch/Kconfig               |  1 +
>>  arch/loongarch/include/asm/kfence.h  | 62 ++++++++++++++++++++++++++++
>>  arch/loongarch/include/asm/pgtable.h | 14 ++++++-
>>  arch/loongarch/mm/fault.c            | 22 ++++++----
>>  4 files changed, 90 insertions(+), 9 deletions(-)
>>  create mode 100644 arch/loongarch/include/asm/kfence.h
>>
>> diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
>> index 70635ea3d1e4..5b63b16be49e 100644
>> --- a/arch/loongarch/Kconfig
>> +++ b/arch/loongarch/Kconfig
>> @@ -91,6 +91,7 @@ config LOONGARCH
>>         select HAVE_ARCH_AUDITSYSCALL
>>         select HAVE_ARCH_JUMP_LABEL
>>         select HAVE_ARCH_JUMP_LABEL_RELATIVE
>> +       select HAVE_ARCH_KFENCE
>>         select HAVE_ARCH_MMAP_RND_BITS if MMU
>>         select HAVE_ARCH_SECCOMP_FILTER
>>         select HAVE_ARCH_TRACEHOOK
>> diff --git a/arch/loongarch/include/asm/kfence.h b/arch/loongarch/include/asm/kfence.h
>> new file mode 100644
>> index 000000000000..fb39076fe4d7
>> --- /dev/null
>> +++ b/arch/loongarch/include/asm/kfence.h
>> @@ -0,0 +1,62 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + * KFENCE support for LoongArch.
>> + *
>> + * Author: Enze Li <lienze@kylinos.cn>
>> + * Copyright (C) 2022-2023 KylinSoft Corporation.
>> + */
>> +
>> +#ifndef _ASM_LOONGARCH_KFENCE_H
>> +#define _ASM_LOONGARCH_KFENCE_H
>> +
>> +#include <linux/kfence.h>
>> +#include <asm/pgtable.h>
>> +#include <asm/tlb.h>
>> +
>> +static inline bool arch_kfence_init_pool(void)
>> +{
>> +       char *kfence_pool = __kfence_pool;
>> +       struct vm_struct *area;
>> +       int err;
>> +
>> +       area = __get_vm_area_caller(KFENCE_POOL_SIZE, VM_IOREMAP,
>> +                                   KFENCE_AREA_START, KFENCE_AREA_END,
>> +                                   __builtin_return_address(0));
>> +       if (!area)
>> +               return false;
>> +
>> +       __kfence_pool = (char *)area->addr;
>> +       err = ioremap_page_range((unsigned long)__kfence_pool,
>> +                                (unsigned long)__kfence_pool + KFENCE_POOL_SIZE,
>> +                                virt_to_phys((void *)kfence_pool),
>> +                                PAGE_KERNEL);
>> +       if (err) {
>> +               free_vm_area(area);
>> +               return false;
>> +       }
>> +
>> +       return true;
>> +}
>> +
>> +/* Protect the given page and flush TLB. */
>> +static inline bool kfence_protect_page(unsigned long addr, bool protect)
>> +{
>> +       pte_t *pte = virt_to_kpte(addr);
>> +
>> +       if (WARN_ON(!pte) || pte_none(*pte))
>> +               return false;
>> +
>> +       if (protect)
>> +               set_pte(pte, __pte(pte_val(*pte) & ~(_PAGE_VALID | _PAGE_PRESENT)));
>> +       else
>> +               set_pte(pte, __pte(pte_val(*pte) | (_PAGE_VALID | _PAGE_PRESENT)));
>> +
>> +       /* Flush this CPU's TLB. */
>> +       preempt_disable();
>> +       local_flush_tlb_one(addr);
>> +       preempt_enable();
>> +
>> +       return true;
>> +}
>> +
>> +#endif /* _ASM_LOONGARCH_KFENCE_H */
>> diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h
>> index 98a0c98de9d1..2702a6ba7122 100644
>> --- a/arch/loongarch/include/asm/pgtable.h
>> +++ b/arch/loongarch/include/asm/pgtable.h
>> @@ -77,6 +77,13 @@ extern unsigned long zero_page_mask;
>>         (virt_to_page((void *)(empty_zero_page + (((unsigned long)(vaddr)) & zero_page_mask))))
>>  #define __HAVE_COLOR_ZERO_PAGE
>>
>> +#ifdef CONFIG_KFENCE
>> +#define KFENCE_AREA_SIZE \
>> +       (((CONFIG_KFENCE_NUM_OBJECTS + 1) * 2 + 2) * PAGE_SIZE)

Hi Huacai,

> Another question: Why define KFENCE_AREA_SIZE while there is already
> KFENCE_POOL_SIZE?

The KFENCE_POOL_SIZE macro is defined in linux/kfence.h.  When I tried
to include this header file, I got the following errors:

----------------------------------------------------------------------
  CC      arch/loongarch/kernel/asm-offsets.s
In file included from ./arch/loongarch/include/asm/pgtable.h:64,
                 from ./include/linux/pgtable.h:6,
                 from ./include/linux/mm.h:29,
                 from arch/loongarch/kernel/asm-offsets.c:9:
./include/linux/kfence.h:93:35: warning: ‘struct kmem_cache’ declared inside parameter list will not be visible outside of this definition or declaration
   93 | void kfence_shutdown_cache(struct kmem_cache *s);
      |                                   ^~~~~~~~~~
./include/linux/kfence.h:99:29: warning: ‘struct kmem_cache’ declared inside parameter list will not be visible outside of this definition or declaration
   99 | void *__kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags);
      |                             ^~~~~~~~~~
./include/linux/kfence.h:117:50: warning: ‘struct kmem_cache’ declared inside parameter list will not be visible outside of this definition or declaration
  117 | static __always_inline void *kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags)
      |                                                  ^~~~~~~~~~
./include/linux/kfence.h: In function ‘kfence_alloc’:
./include/linux/kfence.h:128:31: error: passing argument 1 of ‘__kfence_alloc’ from incompatible pointer type [-Werror=incompatible-pointer-types]
  128 |         return __kfence_alloc(s, size, flags);
      |                               ^
      |                               |
      |                               struct kmem_cache *
--------------------------------------------------------------------

The root cause of this issue is that linux/kfence.h must be expanded
after linux/mm.h, not before.  In other words, we cannot include
"high-level" header files in "low-level" ones.

> And why is KFENCE_AREA_SIZE a little larger than
> KFENCE_POOL_SIZE? If we can reuse KFENCE_POOL_SIZE,
> KFENCE_AREA_START/KFENCE_AREA_END can be renamed to
> KFENCE_POOL_START/KFENCE_POOL_END.

+#define KFENCE_AREA_SIZE \
+       (((CONFIG_KFENCE_NUM_OBJECTS + 1) * 2 + 2) * PAGE_SIZE)
                                              ^^^^^
                                              
Here I've added two extra pages; they are needed for
__get_vm_area_caller() to reserve the space correctly.

1. arch_kfence_init_pool
     __get_vm_area_caller
       __get_vm_area_node
         =================================
           if (!(flags & VM_NO_GUARD))
                   size += PAGE_SIZE;
         =================================

If we do not set VM_NO_GUARD, we get one more page as a guard page.
Setting VM_NO_GUARD is dangerous behavior, so I suggest we keep this
page.

2. arch_kfence_init_pool
     __get_vm_area_caller
       __get_vm_area_node                        !!!This is my comment--
           =======================================                     |
           if (flags & VM_IOREMAP)                                     |
                   align = 1ul << clamp_t(int, ...                     |
           *** We got "align==0x200000" here.  Based on the default  <--
               KFENCE objects of 255, we got the maximum align here. ***
           =======================================

           alloc_vmap_area
             __alloc_vmap_area
               =================================
               nva_start_addr = ALIGN(vstart, align);
               *** When we get here, the starting address is moved
                   forward due to the alignment requirement.  If we
                   have not reserved enough space, we fail on the
                   next line. ***
               
               if (nva_start_addr + size > vend)
                       return vend;
               =================================
               
Theoretically, this alignment could require up to 2MB of extra space.
However, considering that the starting address is fixed (it is
determined by VMEMMAP_END), I think that adding one more page is
enough.
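
Putting the numbers together (assuming the defaults discussed here,
CONFIG_KFENCE_NUM_OBJECTS=255, and a 16KB LoongArch page size; the
arithmetic below is only illustrative):

/*
 * KFENCE_POOL_SIZE = (255 + 1) * 2 * PAGE_SIZE       = 512 pages = 8MB
 * KFENCE_AREA_SIZE = ((255 + 1) * 2 + 2) * PAGE_SIZE = 514 pages
 *
 * One extra page absorbs the implicit guard page added by
 * __get_vm_area_node(); the other absorbs the slack left over when
 * alloc_vmap_area() rounds the fixed start (VMEMMAP_END) up to the
 * 2MB align computed for VM_IOREMAP areas.
 */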

Best Regards,
Enze

>> +#else
>> +#define KFENCE_AREA_SIZE       0
>> +#endif
>> +
>>  /*
>>   * TLB refill handlers may also map the vmalloc area into xkvrange.
>>   * Avoid the first couple of pages so NULL pointer dereferences will
>> @@ -88,11 +95,16 @@ extern unsigned long zero_page_mask;
>>  #define VMALLOC_START  MODULES_END
>>  #define VMALLOC_END    \
>>         (vm_map_base +  \
>> -        min(PTRS_PER_PGD * PTRS_PER_PUD * PTRS_PER_PMD * PTRS_PER_PTE * PAGE_SIZE, (1UL << cpu_vabits)) - PMD_SIZE - VMEMMAP_SIZE)
>> +        min(PTRS_PER_PGD * PTRS_PER_PUD * PTRS_PER_PMD * PTRS_PER_PTE * PAGE_SIZE, (1UL << cpu_vabits)) - PMD_SIZE - VMEMMAP_SIZE - KFENCE_AREA_SIZE)
>>
>>  #define vmemmap                ((struct page *)((VMALLOC_END + PMD_SIZE) & PMD_MASK))
>>  #define VMEMMAP_END    ((unsigned long)vmemmap + VMEMMAP_SIZE - 1)
>>
>> +#ifdef CONFIG_KFENCE
>> +#define KFENCE_AREA_START      VMEMMAP_END
>> +#define KFENCE_AREA_END                (KFENCE_AREA_START + KFENCE_AREA_SIZE)
>> +#endif
>> +
>>  #define pte_ERROR(e) \
>>         pr_err("%s:%d: bad pte %016lx.\n", __FILE__, __LINE__, pte_val(e))
>>  #ifndef __PAGETABLE_PMD_FOLDED
>> diff --git a/arch/loongarch/mm/fault.c b/arch/loongarch/mm/fault.c
>> index da5b6d518cdb..c0319128b221 100644
>> --- a/arch/loongarch/mm/fault.c
>> +++ b/arch/loongarch/mm/fault.c
>> @@ -23,6 +23,7 @@
>>  #include <linux/kprobes.h>
>>  #include <linux/perf_event.h>
>>  #include <linux/uaccess.h>
>> +#include <linux/kfence.h>
>>
>>  #include <asm/branch.h>
>>  #include <asm/mmu_context.h>
>> @@ -30,7 +31,8 @@
>>
>>  int show_unhandled_signals = 1;
>>
>> -static void __kprobes no_context(struct pt_regs *regs, unsigned long address)
>> +static void __kprobes no_context(struct pt_regs *regs, unsigned long address,
>> +                                unsigned long write)
>>  {
>>         const int field = sizeof(unsigned long) * 2;
>>
>> @@ -38,6 +40,9 @@ static void __kprobes no_context(struct pt_regs *regs, unsigned long address)
>>         if (fixup_exception(regs))
>>                 return;
>>
>> +       if (kfence_handle_page_fault(address, write, regs))
>> +               return;
>> +
>>         /*
>>          * Oops. The kernel tried to access some bad page. We'll have to
>>          * terminate things with extreme prejudice.
>> @@ -51,14 +56,15 @@ static void __kprobes no_context(struct pt_regs *regs, unsigned long address)
>>         die("Oops", regs);
>>  }
>>
>> -static void __kprobes do_out_of_memory(struct pt_regs *regs, unsigned long address)
>> +static void __kprobes do_out_of_memory(struct pt_regs *regs, unsigned long address,
>> +                                      unsigned long write)
>>  {
>>         /*
>>          * We ran out of memory, call the OOM killer, and return the userspace
>>          * (which will retry the fault, or kill us if we got oom-killed).
>>          */
>>         if (!user_mode(regs)) {
>> -               no_context(regs, address);
>> +               no_context(regs, address, write);
>>                 return;
>>         }
>>         pagefault_out_of_memory();
>> @@ -69,7 +75,7 @@ static void __kprobes do_sigbus(struct pt_regs *regs,
>>  {
>>         /* Kernel mode? Handle exceptions or die */
>>         if (!user_mode(regs)) {
>> -               no_context(regs, address);
>> +               no_context(regs, address, write);
>>                 return;
>>         }
>>
>> @@ -90,7 +96,7 @@ static void __kprobes do_sigsegv(struct pt_regs *regs,
>>
>>         /* Kernel mode? Handle exceptions or die */
>>         if (!user_mode(regs)) {
>> -               no_context(regs, address);
>> +               no_context(regs, address, write);
>>                 return;
>>         }
>>
>> @@ -149,7 +155,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs,
>>          */
>>         if (address & __UA_LIMIT) {
>>                 if (!user_mode(regs))
>> -                       no_context(regs, address);
>> +                       no_context(regs, address, write);
>>                 else
>>                         do_sigsegv(regs, write, address, si_code);
>>                 return;
>> @@ -211,7 +217,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs,
>>
>>         if (fault_signal_pending(fault, regs)) {
>>                 if (!user_mode(regs))
>> -                       no_context(regs, address);
>> +                       no_context(regs, address, write);
>>                 return;
>>         }
>>
>> @@ -232,7 +238,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs,
>>         if (unlikely(fault & VM_FAULT_ERROR)) {
>>                 mmap_read_unlock(mm);
>>                 if (fault & VM_FAULT_OOM) {
>> -                       do_out_of_memory(regs, address);
>> +                       do_out_of_memory(regs, address, write);
>>                         return;
>>                 } else if (fault & VM_FAULT_SIGSEGV) {
>>                         do_sigsegv(regs, write, address, si_code);
>> --
>> 2.34.1
>>
>>


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4 v2] LoongArch: Add KFENCE support
  2023-07-28  3:27     ` Enze Li
@ 2023-07-28  4:33       ` Huacai Chen
  0 siblings, 0 replies; 16+ messages in thread
From: Huacai Chen @ 2023-07-28  4:33 UTC (permalink / raw)
  To: Enze Li
  Cc: kernel, loongarch, glider, elver, akpm, kasan-dev, linux-mm,
	zhangqing, yangtiezhu, dvyukov

On Fri, Jul 28, 2023 at 11:28 AM Enze Li <lienze@kylinos.cn> wrote:
>
> On Thu, Jul 27 2023 at 09:26:04 AM +0800, Huacai Chen wrote:
>
> > On Tue, Jul 25, 2023 at 2:15 PM Enze Li <lienze@kylinos.cn> wrote:
> >>
> >> The LoongArch architecture is quite different from other architectures.
> >> When the KFENCE pool itself is allocated, it is mapped in the direct
> >> mapping configuration window [1] by default on LoongArch.  This means
> >> that it is not possible to use the page table mapped mode required by
> >> the KFENCE system, and therefore the pool should be remapped to an
> >> appropriate region.
> >>
> >> This patch adds architecture-specific implementation details for KFENCE.
> >> In particular, this implements the required interface in <asm/kfence.h>.
> >>
> >> Tested this patch by running the testcases; all of them passed.
> >>
> >> [1] https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#virtual-address-space-and-address-translation-mode
> >>
> >> Signed-off-by: Enze Li <lienze@kylinos.cn>
> >> ---
> >>  arch/loongarch/Kconfig               |  1 +
> >>  arch/loongarch/include/asm/kfence.h  | 62 ++++++++++++++++++++++++++++
> >>  arch/loongarch/include/asm/pgtable.h | 14 ++++++-
> >>  arch/loongarch/mm/fault.c            | 22 ++++++----
> >>  4 files changed, 90 insertions(+), 9 deletions(-)
> >>  create mode 100644 arch/loongarch/include/asm/kfence.h
> >>
> >> diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> >> index 70635ea3d1e4..5b63b16be49e 100644
> >> --- a/arch/loongarch/Kconfig
> >> +++ b/arch/loongarch/Kconfig
> >> @@ -91,6 +91,7 @@ config LOONGARCH
> >>         select HAVE_ARCH_AUDITSYSCALL
> >>         select HAVE_ARCH_JUMP_LABEL
> >>         select HAVE_ARCH_JUMP_LABEL_RELATIVE
> >> +       select HAVE_ARCH_KFENCE
> >>         select HAVE_ARCH_MMAP_RND_BITS if MMU
> >>         select HAVE_ARCH_SECCOMP_FILTER
> >>         select HAVE_ARCH_TRACEHOOK
> >> diff --git a/arch/loongarch/include/asm/kfence.h b/arch/loongarch/include/asm/kfence.h
> >> new file mode 100644
> >> index 000000000000..fb39076fe4d7
> >> --- /dev/null
> >> +++ b/arch/loongarch/include/asm/kfence.h
> >> @@ -0,0 +1,62 @@
> >> +/* SPDX-License-Identifier: GPL-2.0 */
> >> +/*
> >> + * KFENCE support for LoongArch.
> >> + *
> >> + * Author: Enze Li <lienze@kylinos.cn>
> >> + * Copyright (C) 2022-2023 KylinSoft Corporation.
> >> + */
> >> +
> >> +#ifndef _ASM_LOONGARCH_KFENCE_H
> >> +#define _ASM_LOONGARCH_KFENCE_H
> >> +
> >> +#include <linux/kfence.h>
> >> +#include <asm/pgtable.h>
> >> +#include <asm/tlb.h>
> >> +
> >> +static inline bool arch_kfence_init_pool(void)
> >> +{
> >> +       char *kfence_pool = __kfence_pool;
> >> +       struct vm_struct *area;
> >> +       int err;
> >> +
> >> +       area = __get_vm_area_caller(KFENCE_POOL_SIZE, VM_IOREMAP,
> >> +                                   KFENCE_AREA_START, KFENCE_AREA_END,
> >> +                                   __builtin_return_address(0));
> >> +       if (!area)
> >> +               return false;
> >> +
> >> +       __kfence_pool = (char *)area->addr;
> >> +       err = ioremap_page_range((unsigned long)__kfence_pool,
> >> +                                (unsigned long)__kfence_pool + KFENCE_POOL_SIZE,
> >> +                                virt_to_phys((void *)kfence_pool),
> >> +                                PAGE_KERNEL);
> >> +       if (err) {
> >> +               free_vm_area(area);
> >> +               return false;
> >> +       }
> >> +
> >> +       return true;
> >> +}
> >> +
> >> +/* Protect the given page and flush TLB. */
> >> +static inline bool kfence_protect_page(unsigned long addr, bool protect)
> >> +{
> >> +       pte_t *pte = virt_to_kpte(addr);
> >> +
> >> +       if (WARN_ON(!pte) || pte_none(*pte))
> >> +               return false;
> >> +
> >> +       if (protect)
> >> +               set_pte(pte, __pte(pte_val(*pte) & ~(_PAGE_VALID | _PAGE_PRESENT)));
> >> +       else
> >> +               set_pte(pte, __pte(pte_val(*pte) | (_PAGE_VALID | _PAGE_PRESENT)));
> >> +
> >> +       /* Flush this CPU's TLB. */
> >> +       preempt_disable();
> >> +       local_flush_tlb_one(addr);
> >> +       preempt_enable();
> >> +
> >> +       return true;
> >> +}
> >> +
> >> +#endif /* _ASM_LOONGARCH_KFENCE_H */
> >> diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h
> >> index 98a0c98de9d1..2702a6ba7122 100644
> >> --- a/arch/loongarch/include/asm/pgtable.h
> >> +++ b/arch/loongarch/include/asm/pgtable.h
> >> @@ -77,6 +77,13 @@ extern unsigned long zero_page_mask;
> >>         (virt_to_page((void *)(empty_zero_page + (((unsigned long)(vaddr)) & zero_page_mask))))
> >>  #define __HAVE_COLOR_ZERO_PAGE
> >>
> >> +#ifdef CONFIG_KFENCE
> >> +#define KFENCE_AREA_SIZE \
> >> +       (((CONFIG_KFENCE_NUM_OBJECTS + 1) * 2 + 2) * PAGE_SIZE)
>
> Hi Huacai,
>
> > Another question: Why define KFENCE_AREA_SIZE while there is already
> > KFENCE_POOL_SIZE?
>
> The KFENCE_POOL_SIZE macro is defined in linux/kfence.h.  When I tried
> to include this header file, I got the following errors:
>
> ----------------------------------------------------------------------
>   CC      arch/loongarch/kernel/asm-offsets.s
> In file included from ./arch/loongarch/include/asm/pgtable.h:64,
>                  from ./include/linux/pgtable.h:6,
>                  from ./include/linux/mm.h:29,
>                  from arch/loongarch/kernel/asm-offsets.c:9:
> ./include/linux/kfence.h:93:35: warning: ‘struct kmem_cache’ declared inside parameter list will not be visible outside of this definition or declaration
>    93 | void kfence_shutdown_cache(struct kmem_cache *s);
>       |                                   ^~~~~~~~~~
> ./include/linux/kfence.h:99:29: warning: ‘struct kmem_cache’ declared inside parameter list will not be visible outside of this definition or declaration
>    99 | void *__kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags);
>       |                             ^~~~~~~~~~
> ./include/linux/kfence.h:117:50: warning: ‘struct kmem_cache’ declared inside parameter list will not be visible outside of this definition or declaration
>   117 | static __always_inline void *kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags)
>       |                                                  ^~~~~~~~~~
> ./include/linux/kfence.h: In function ‘kfence_alloc’:
> ./include/linux/kfence.h:128:31: error: passing argument 1 of ‘__kfence_alloc’ from incompatible pointer type [-Werror=incompatible-pointer-types]
>   128 |         return __kfence_alloc(s, size, flags);
>       |                               ^
>       |                               |
>       |                               struct kmem_cache *
> --------------------------------------------------------------------
>
> The root cause of this issue is that linux/kfence.h must be expanded
> after linux/mm.h, not before.  In other words, we cannot include
> "high-level" header files in "low-level" ones.
>
> > And why is KFENCE_AREA_SIZE a little larger than
> > KFENCE_POOL_SIZE? If we can reuse KFENCE_POOL_SIZE,
> > KFENCE_AREA_START/KFENCE_AREA_END can be renamed to
> > KFENCE_POOL_START/KFENCE_POOL_END.
>
> +#define KFENCE_AREA_SIZE \
> +       (((CONFIG_KFENCE_NUM_OBJECTS + 1) * 2 + 2) * PAGE_SIZE)
>                                               ^^^^^
>
> Here I've added two extra pages; they are needed for
> __get_vm_area_caller() to reserve the space correctly.
>
> 1. arch_kfence_init_pool
>      __get_vm_area_caller
>        __get_vm_area_node
>          =================================
>            if (!(flags & VM_NO_GUARD))
>                    size += PAGE_SIZE;
>          =================================
>
> If we do not set VM_NO_GUARD, we get one more page as a guard page.
> Setting VM_NO_GUARD is dangerous behavior, so I suggest we keep this
> page.
>
> 2. arch_kfence_init_pool
>      __get_vm_area_caller
>        __get_vm_area_node                        !!!This is my comment--
>            =======================================                     |
>            if (flags & VM_IOREMAP)                                     |
>                    align = 1ul << clamp_t(int, ...                     |
>            *** We got "align==0x200000" here.  Based on the default  <--
>                KFENCE objects of 255, we got the maximum align here. ***
>            =======================================
>
>            alloc_vmap_area
>              __alloc_vmap_area
>                =================================
>                nva_start_addr = ALIGN(vstart, align);
>                *** When we get here, the starting address is moved
>                    forward due to the alignment requirement.  If we
>                    have not reserved enough space, we fail on the
>                    next line. ***
>
>                if (nva_start_addr + size > vend)
>                        return vend;
>                =================================
>
> Theoretically, this alignment could require up to 2MB of extra space.
> However, considering that the starting address is fixed (it is
> determined by VMEMMAP_END), I think that adding one more page is
> enough.
OK, that makes sense.

Huacai
>
> Best Regards,
> Enze
>
> >> +#else
> >> +#define KFENCE_AREA_SIZE       0
> >> +#endif
> >> +
> >>  /*
> >>   * TLB refill handlers may also map the vmalloc area into xkvrange.
> >>   * Avoid the first couple of pages so NULL pointer dereferences will
> >> @@ -88,11 +95,16 @@ extern unsigned long zero_page_mask;
> >>  #define VMALLOC_START  MODULES_END
> >>  #define VMALLOC_END    \
> >>         (vm_map_base +  \
> >> -        min(PTRS_PER_PGD * PTRS_PER_PUD * PTRS_PER_PMD * PTRS_PER_PTE * PAGE_SIZE, (1UL << cpu_vabits)) - PMD_SIZE - VMEMMAP_SIZE)
> >> +        min(PTRS_PER_PGD * PTRS_PER_PUD * PTRS_PER_PMD * PTRS_PER_PTE * PAGE_SIZE, (1UL << cpu_vabits)) - PMD_SIZE - VMEMMAP_SIZE - KFENCE_AREA_SIZE)
> >>
> >>  #define vmemmap                ((struct page *)((VMALLOC_END + PMD_SIZE) & PMD_MASK))
> >>  #define VMEMMAP_END    ((unsigned long)vmemmap + VMEMMAP_SIZE - 1)
> >>
> >> +#ifdef CONFIG_KFENCE
> >> +#define KFENCE_AREA_START      VMEMMAP_END
> >> +#define KFENCE_AREA_END                (KFENCE_AREA_START + KFENCE_AREA_SIZE)
> >> +#endif
> >> +
> >>  #define pte_ERROR(e) \
> >>         pr_err("%s:%d: bad pte %016lx.\n", __FILE__, __LINE__, pte_val(e))
> >>  #ifndef __PAGETABLE_PMD_FOLDED
> >> diff --git a/arch/loongarch/mm/fault.c b/arch/loongarch/mm/fault.c
> >> index da5b6d518cdb..c0319128b221 100644
> >> --- a/arch/loongarch/mm/fault.c
> >> +++ b/arch/loongarch/mm/fault.c
> >> @@ -23,6 +23,7 @@
> >>  #include <linux/kprobes.h>
> >>  #include <linux/perf_event.h>
> >>  #include <linux/uaccess.h>
> >> +#include <linux/kfence.h>
> >>
> >>  #include <asm/branch.h>
> >>  #include <asm/mmu_context.h>
> >> @@ -30,7 +31,8 @@
> >>
> >>  int show_unhandled_signals = 1;
> >>
> >> -static void __kprobes no_context(struct pt_regs *regs, unsigned long address)
> >> +static void __kprobes no_context(struct pt_regs *regs, unsigned long address,
> >> +                                unsigned long write)
> >>  {
> >>         const int field = sizeof(unsigned long) * 2;
> >>
> >> @@ -38,6 +40,9 @@ static void __kprobes no_context(struct pt_regs *regs, unsigned long address)
> >>         if (fixup_exception(regs))
> >>                 return;
> >>
> >> +       if (kfence_handle_page_fault(address, write, regs))
> >> +               return;
> >> +
> >>         /*
> >>          * Oops. The kernel tried to access some bad page. We'll have to
> >>          * terminate things with extreme prejudice.
> >> @@ -51,14 +56,15 @@ static void __kprobes no_context(struct pt_regs *regs, unsigned long address)
> >>         die("Oops", regs);
> >>  }
> >>
> >> -static void __kprobes do_out_of_memory(struct pt_regs *regs, unsigned long address)
> >> +static void __kprobes do_out_of_memory(struct pt_regs *regs, unsigned long address,
> >> +                                      unsigned long write)
> >>  {
> >>         /*
> >>          * We ran out of memory, call the OOM killer, and return the userspace
> >>          * (which will retry the fault, or kill us if we got oom-killed).
> >>          */
> >>         if (!user_mode(regs)) {
> >> -               no_context(regs, address);
> >> +               no_context(regs, address, write);
> >>                 return;
> >>         }
> >>         pagefault_out_of_memory();
> >> @@ -69,7 +75,7 @@ static void __kprobes do_sigbus(struct pt_regs *regs,
> >>  {
> >>         /* Kernel mode? Handle exceptions or die */
> >>         if (!user_mode(regs)) {
> >> -               no_context(regs, address);
> >> +               no_context(regs, address, write);
> >>                 return;
> >>         }
> >>
> >> @@ -90,7 +96,7 @@ static void __kprobes do_sigsegv(struct pt_regs *regs,
> >>
> >>         /* Kernel mode? Handle exceptions or die */
> >>         if (!user_mode(regs)) {
> >> -               no_context(regs, address);
> >> +               no_context(regs, address, write);
> >>                 return;
> >>         }
> >>
> >> @@ -149,7 +155,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs,
> >>          */
> >>         if (address & __UA_LIMIT) {
> >>                 if (!user_mode(regs))
> >> -                       no_context(regs, address);
> >> +                       no_context(regs, address, write);
> >>                 else
> >>                         do_sigsegv(regs, write, address, si_code);
> >>                 return;
> >> @@ -211,7 +217,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs,
> >>
> >>         if (fault_signal_pending(fault, regs)) {
> >>                 if (!user_mode(regs))
> >> -                       no_context(regs, address);
> >> +                       no_context(regs, address, write);
> >>                 return;
> >>         }
> >>
> >> @@ -232,7 +238,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs,
> >>         if (unlikely(fault & VM_FAULT_ERROR)) {
> >>                 mmap_read_unlock(mm);
> >>                 if (fault & VM_FAULT_OOM) {
> >> -                       do_out_of_memory(regs, address);
> >> +                       do_out_of_memory(regs, address, write);
> >>                         return;
> >>                 } else if (fault & VM_FAULT_SIGSEGV) {
> >>                         do_sigsegv(regs, write, address, si_code);
> >> --
> >> 2.34.1
> >>
> >>
>


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4 v2] LoongArch: Add KFENCE support
  2023-07-25 14:34   ` Jackie Liu
@ 2023-07-28  6:01     ` Enze Li
  0 siblings, 0 replies; 16+ messages in thread
From: Enze Li @ 2023-07-28  6:01 UTC (permalink / raw)
  To: Jackie Liu
  Cc: chenhuacai, kernel, loongarch, glider, elver, akpm, kasan-dev,
	linux-mm, zhangqing, yangtiezhu, dvyukov

On Tue, Jul 25 2023 at 10:34:50 PM +0800, Jackie Liu wrote:

> 在 2023/7/25 14:14, Enze Li 写道:
>> The LoongArch architecture is quite different from other architectures.
>> When the KFENCE pool itself is allocated, it is mapped in the direct
>> mapping configuration window [1] by default on LoongArch.  This means
>> that it is not possible to use the page table mapped mode required by
>> the KFENCE system, and therefore the pool should be remapped to an
>> appropriate region.
>>
>> This patch adds architecture-specific implementation details for KFENCE.
>> In particular, this implements the required interface in <asm/kfence.h>.
>>
>> Tested this patch by running the testcases; all of them passed.
>>
>> [1] https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#virtual-address-space-and-address-translation-mode
>>
>> Signed-off-by: Enze Li <lienze@kylinos.cn>
>> ---
>>   arch/loongarch/Kconfig               |  1 +
>>   arch/loongarch/include/asm/kfence.h  | 62 ++++++++++++++++++++++++++++
>>   arch/loongarch/include/asm/pgtable.h | 14 ++++++-
>>   arch/loongarch/mm/fault.c            | 22 ++++++----
>>   4 files changed, 90 insertions(+), 9 deletions(-)
>>   create mode 100644 arch/loongarch/include/asm/kfence.h
>>
>> diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
>> index 70635ea3d1e4..5b63b16be49e 100644
>> --- a/arch/loongarch/Kconfig
>> +++ b/arch/loongarch/Kconfig
>> @@ -91,6 +91,7 @@ config LOONGARCH
>>   	select HAVE_ARCH_AUDITSYSCALL
>>   	select HAVE_ARCH_JUMP_LABEL
>>   	select HAVE_ARCH_JUMP_LABEL_RELATIVE
>> +	select HAVE_ARCH_KFENCE
>>   	select HAVE_ARCH_MMAP_RND_BITS if MMU
>>   	select HAVE_ARCH_SECCOMP_FILTER
>>   	select HAVE_ARCH_TRACEHOOK
>> diff --git a/arch/loongarch/include/asm/kfence.h b/arch/loongarch/include/asm/kfence.h
>> new file mode 100644
>> index 000000000000..fb39076fe4d7
>> --- /dev/null
>> +++ b/arch/loongarch/include/asm/kfence.h
>> @@ -0,0 +1,62 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + * KFENCE support for LoongArch.
>> + *
>> + * Author: Enze Li <lienze@kylinos.cn>
>> + * Copyright (C) 2022-2023 KylinSoft Corporation.
>> + */
>> +
>> +#ifndef _ASM_LOONGARCH_KFENCE_H
>> +#define _ASM_LOONGARCH_KFENCE_H
>> +
>> +#include <linux/kfence.h>
>> +#include <asm/pgtable.h>
>> +#include <asm/tlb.h>
>> +
>> +static inline bool arch_kfence_init_pool(void)
>> +{
>> +	char *kfence_pool = __kfence_pool;
>> +	struct vm_struct *area;
>> +	int err;
>> +
>> +	area = __get_vm_area_caller(KFENCE_POOL_SIZE, VM_IOREMAP,
>> +				    KFENCE_AREA_START, KFENCE_AREA_END,
>> +				    __builtin_return_address(0));
>> +	if (!area)
>> +		return false;
>> +
>> +	__kfence_pool = (char *)area->addr;
>
> I think something is wrong here.
>
>> +	err = ioremap_page_range((unsigned long)__kfence_pool,
>> +				 (unsigned long)__kfence_pool + KFENCE_POOL_SIZE,
>> +				 virt_to_phys((void *)kfence_pool),
>> +				 PAGE_KERNEL);
>> +	if (err) {
>> +		free_vm_area(area);
>
> If err != 0, __kfence_pool is left pointing at area->addr here, which
> has just been freed.  That's not correct.

Hi Jackie,

Good catch!  I'll fix this issue in v3.
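
Something along these lines (just a sketch for now; the final v3 may
differ):

	err = ioremap_page_range((unsigned long)__kfence_pool,
				 (unsigned long)__kfence_pool + KFENCE_POOL_SIZE,
				 virt_to_phys((void *)kfence_pool),
				 PAGE_KERNEL);
	if (err) {
		free_vm_area(area);
		/* Don't leave __kfence_pool pointing into the freed area. */
		__kfence_pool = kfence_pool;
		return false;
	}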

Cheers!
Enze



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/4 v2] LoongArch: Get stack without NMI when providing regs parameter
  2023-07-26  2:59   ` Jinyang He
@ 2023-07-28  6:57     ` Enze Li
  0 siblings, 0 replies; 16+ messages in thread
From: Enze Li @ 2023-07-28  6:57 UTC (permalink / raw)
  To: Jinyang He
  Cc: chenhuacai, kernel, loongarch, glider, elver, akpm, kasan-dev,
	linux-mm, yangtiezhu, dvyukov

On Wed, Jul 26 2023 at 10:59:06 AM +0800, Jinyang He wrote:

> On 2023-07-25 14:14, Enze Li wrote:
>
>> Currently, arch_stack_walk() can only get the full stack information,
>> including the NMI frames.  This is because the implementation of
>> arch_stack_walk() is forced to ignore the information passed via the
>> regs parameter and to use the current stack instead.
>>
>> For some detection systems like KFENCE, only partial stack information
>> is needed; in particular, the stack frames at the point where the
>> interrupt occurred.
>>
>> To support KFENCE, this patch modifies arch_stack_walk() so that, when
>> it is called with a non-NULL regs argument, it retains all the stack
>> information in regs and uses it to provide an accurate stack trace.
>>
>> Before the patch is applied, I get,
>> [    1.531195 ] ==================================================================
>> [    1.531442 ] BUG: KFENCE: out-of-bounds read in stack_trace_save_regs+0x48/0x6c
>> [    1.531442 ]
>> [    1.531900 ] Out-of-bounds read at 0xffff800012267fff (1B left of kfence-#12):
>> [    1.532046 ]  stack_trace_save_regs+0x48/0x6c
>> [    1.532169 ]  kfence_report_error+0xa4/0x528
>> [    1.532276 ]  kfence_handle_page_fault+0x124/0x270
>> [    1.532388 ]  no_context+0x50/0x94
>> [    1.532453 ]  do_page_fault+0x1a8/0x36c
>> [    1.532524 ]  tlb_do_page_fault_0+0x118/0x1b4
>> [    1.532623 ]  test_out_of_bounds_read+0xa0/0x1d8
>> [    1.532745 ]  kunit_generic_run_threadfn_adapter+0x1c/0x28
>> [    1.532854 ]  kthread+0x124/0x130
>> [    1.532922 ]  ret_from_kernel_thread+0xc/0xa4
>> <snip>
>>
>> With this patch applied, I get the correct stack information.
>> [    1.320220 ] ==================================================================
>> [    1.320401 ] BUG: KFENCE: out-of-bounds read in test_out_of_bounds_read+0xa8/0x1d8
>> [    1.320401 ]
>> [    1.320898 ] Out-of-bounds read at 0xffff800012257fff (1B left of kfence-#10):
>> [    1.321134 ]  test_out_of_bounds_read+0xa8/0x1d8
>> [    1.321264 ]  kunit_generic_run_threadfn_adapter+0x1c/0x28
>> [    1.321392 ]  kthread+0x124/0x130
>> [    1.321459 ]  ret_from_kernel_thread+0xc/0xa4
>> <snip>
>>
>> Signed-off-by: Enze Li <lienze@kylinos.cn>
>> ---
>>   arch/loongarch/kernel/stacktrace.c | 20 ++++++++++++++------
>>   1 file changed, 14 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/loongarch/kernel/stacktrace.c b/arch/loongarch/kernel/stacktrace.c
>> index 2463d2fea21f..9dab30ae68ec 100644
>> --- a/arch/loongarch/kernel/stacktrace.c
>> +++ b/arch/loongarch/kernel/stacktrace.c
>> @@ -18,16 +18,24 @@ void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
>>   	struct pt_regs dummyregs;
>>   	struct unwind_state state;
>>   -	regs = &dummyregs;
>> -
>>   	if (task == current) {
>> -		regs->regs[3] = (unsigned long)__builtin_frame_address(0);
>> -		regs->csr_era = (unsigned long)__builtin_return_address(0);
>> +		if (regs)
>> +			memcpy(&dummyregs, regs, sizeof(*regs));
>> +		else {
>> +			dummyregs.regs[3] = (unsigned long)__builtin_frame_address(0);
>> +			dummyregs.csr_era = (unsigned long)__builtin_return_address(0);
>> +		}
>>   	} else {
>> -		regs->regs[3] = thread_saved_fp(task);
>> -		regs->csr_era = thread_saved_ra(task);
>> +		if (regs)
>> +			memcpy(&dummyregs, regs, sizeof(*regs));
>> +		else {
>> +			dummyregs.regs[3] = thread_saved_fp(task);
>> +			dummyregs.csr_era = thread_saved_ra(task);
>> +		}
>>   	}
>>   +	regs = &dummyregs;
>> +

Hi Jinyang,

>
> if (!regs) {
>     regs = &dummyregs;
>
>     if (task == current) {
>         regs->regs[3] = (unsigned long)__builtin_frame_address(0);
>         regs->csr_era = (unsigned long)__builtin_return_address(0);
>     } else {
>         regs->regs[3] = thread_saved_fp(task);
>         regs->csr_era = thread_saved_ra(task);
>     }
>     regs->regs[1] = 0;
> }

Excellent!  FWIW, this version is easier to understand.
I've tested this change, and it works well.  Thank you.
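
For reference, the consumer that depends on this is KFENCE's report
path, which does roughly the following when a fault's regs are
available (paraphrased from mm/kfence/report.c):

	if (regs)
		num_entries = stack_trace_save_regs(regs, entries,
						    ARRAY_SIZE(entries), 0);
	else
		num_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 1);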

Cheers!
Enze

>
> BTW, I remember that __unwind_start() already deals with the regs, task
> and current cases. arch_stack_walk() would not need to provide current
> or task regs if we fixed unwind_start() to skip its parent frame (the
> caller, arch_stack_walk). But the current approach is better, I think.
>
>
> Thanks,
>
> Jinyang
>
>>   	regs->regs[1] = 0;
>>   	for (unwind_start(&state, task, regs);
>>   	     !unwind_done(&state) && !unwind_error(&state); unwind_next_frame(&state)) {


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2023-07-28  6:57 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-25  6:14 [PATCH 0/4 v2] Add KFENCE support for LoongArch Enze Li
2023-07-25  6:14 ` [PATCH 1/4 v2] LoongArch: mm: Add page table mapped mode support Enze Li
2023-07-25  7:38   ` Huacai Chen
2023-07-25  6:14 ` [PATCH 2/4 v2] LoongArch: Get stack without NMI when providing regs parameter Enze Li
2023-07-25  7:40   ` Huacai Chen
2023-07-26  2:59   ` Jinyang He
2023-07-28  6:57     ` Enze Li
2023-07-25  6:14 ` [PATCH 3/4 v2] KFENCE: Defer the assignment of the local variable addr Enze Li
2023-07-25  7:45   ` Huacai Chen
2023-07-25  6:14 ` [PATCH 4/4 v2] LoongArch: Add KFENCE support Enze Li
2023-07-25  7:48   ` Huacai Chen
2023-07-25 14:34   ` Jackie Liu
2023-07-28  6:01     ` Enze Li
2023-07-27  1:26   ` Huacai Chen
2023-07-28  3:27     ` Enze Li
2023-07-28  4:33       ` Huacai Chen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).