linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v2 0/4] I-side fixes
@ 2018-06-22  8:31 Will Deacon
  2018-06-22  8:31 ` [PATCH v2 1/4] arm64: Avoid flush_icache_range() in alternatives patching code Will Deacon
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Will Deacon @ 2018-06-22  8:31 UTC
  To: linux-arm-kernel

Hi all,

Here's version two of the patches I previously posted here:

  http://lists.infradead.org/pipermail/linux-arm-kernel/2018-June/584857.html

Changes since v1 include:

  * Removal of "hotpatch" mechanism for certain classes of instructions
  * Avoid attempting IPI when secondaries are in stop_machine()
  * Leave ISBs intact in the stop_machine() callbacks

Cheers,

Will

--->8

Will Deacon (4):
  arm64: Avoid flush_icache_range() in alternatives patching code
  arm64: Remove unnecessary ISBs from set_{pte,pmd,pud}
  arm64: IPI each CPU after invalidating the I-cache for kernel mappings
  arm64: insn: Don't fallback on nosync path for general insn patching

 arch/arm64/include/asm/alternative.h |  7 +++-
 arch/arm64/include/asm/cacheflush.h  | 13 ++++++-
 arch/arm64/include/asm/insn.h        |  2 --
 arch/arm64/include/asm/pgtable.h     |  6 +---
 arch/arm64/kernel/alternative.c      | 51 ++++++++++++++++++++++----
 arch/arm64/kernel/cpu_errata.c       |  2 +-
 arch/arm64/kernel/insn.c             | 70 ++----------------------------------
 arch/arm64/kernel/module.c           |  5 ++-
 arch/arm64/mm/cache.S                |  4 +--
 9 files changed, 71 insertions(+), 89 deletions(-)

-- 
2.1.4

* [PATCH v2 1/4] arm64: Avoid flush_icache_range() in alternatives patching code
  2018-06-22  8:31 [PATCH v2 0/4] I-side fixes Will Deacon
@ 2018-06-22  8:31 ` Will Deacon
  2018-06-22  8:31 ` [PATCH v2 2/4] arm64: Remove unnecessary ISBs from set_{pte,pmd,pud} Will Deacon
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Will Deacon @ 2018-06-22  8:31 UTC
  To: linux-arm-kernel

The implementation of flush_icache_range() includes instruction sequences
which are themselves patched at runtime, so it is not safe to call from
the patching framework.

This patch reworks the alternatives cache-flushing code so that it rolls
its own internal D-cache maintenance using DC CIVAC before invalidating
the entire I-cache after all alternatives have been applied at boot.
Modules don't cause any issues, since flush_icache_range() is safe to
call by the time they are loaded.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Rohit Khanna <rokhanna@nvidia.com>
Cc: Alexander Van Brunt <avanbrunt@nvidia.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/alternative.h |  7 ++++-
 arch/arm64/kernel/alternative.c      | 51 +++++++++++++++++++++++++++++++-----
 arch/arm64/kernel/module.c           |  5 ++--
 3 files changed, 52 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/include/asm/alternative.h b/arch/arm64/include/asm/alternative.h
index a91933b1e2e6..4b650ec1d7dd 100644
--- a/arch/arm64/include/asm/alternative.h
+++ b/arch/arm64/include/asm/alternative.h
@@ -28,7 +28,12 @@ typedef void (*alternative_cb_t)(struct alt_instr *alt,
 				 __le32 *origptr, __le32 *updptr, int nr_inst);
 
 void __init apply_alternatives_all(void);
-void apply_alternatives(void *start, size_t length);
+
+#ifdef CONFIG_MODULES
+void apply_alternatives_module(void *start, size_t length);
+#else
+static inline void apply_alternatives_module(void *start, size_t length) { }
+#endif
 
 #define ALTINSTR_ENTRY(feature,cb)					      \
 	" .word 661b - .\n"				/* label           */ \
diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
index 5c4bce4ac381..36fb069fd049 100644
--- a/arch/arm64/kernel/alternative.c
+++ b/arch/arm64/kernel/alternative.c
@@ -122,7 +122,30 @@ static void patch_alternative(struct alt_instr *alt,
 	}
 }
 
-static void __apply_alternatives(void *alt_region, bool use_linear_alias)
+/*
+ * We provide our own, private D-cache cleaning function so that we don't
+ * accidentally call into the cache.S code, which is patched by us at
+ * runtime.
+ */
+static void clean_dcache_range_nopatch(u64 start, u64 end)
+{
+	u64 cur, d_size, ctr_el0;
+
+	ctr_el0 = read_sanitised_ftr_reg(SYS_CTR_EL0);
+	d_size = 4 << cpuid_feature_extract_unsigned_field(ctr_el0,
+							   CTR_DMINLINE_SHIFT);
+	cur = start & ~(d_size - 1);
+	do {
+		/*
+		 * We must clean+invalidate to the PoC in order to avoid
+		 * Cortex-A53 errata 826319, 827319, 824069 and 819472
+		 * (this corresponds to ARM64_WORKAROUND_CLEAN_CACHE)
+		 */
+		asm volatile("dc civac, %0" : : "r" (cur) : "memory");
+	} while (cur += d_size, cur < end);
+}
+
+static void __apply_alternatives(void *alt_region, bool is_module)
 {
 	struct alt_instr *alt;
 	struct alt_region *region = alt_region;
@@ -145,7 +168,7 @@ static void __apply_alternatives(void *alt_region, bool use_linear_alias)
 		pr_info_once("patching kernel code\n");
 
 		origptr = ALT_ORIG_PTR(alt);
-		updptr = use_linear_alias ? lm_alias(origptr) : origptr;
+		updptr = is_module ? origptr : lm_alias(origptr);
 		nr_inst = alt->orig_len / AARCH64_INSN_SIZE;
 
 		if (alt->cpufeature < ARM64_CB_PATCH)
@@ -155,8 +178,20 @@ static void __apply_alternatives(void *alt_region, bool use_linear_alias)
 
 		alt_cb(alt, origptr, updptr, nr_inst);
 
-		flush_icache_range((uintptr_t)origptr,
-				   (uintptr_t)(origptr + nr_inst));
+		if (!is_module) {
+			clean_dcache_range_nopatch((u64)origptr,
+						   (u64)(origptr + nr_inst));
+		}
+	}
+
+	/*
+	 * The core module code takes care of cache maintenance in
+	 * flush_module_icache().
+	 */
+	if (!is_module) {
+		dsb(ish);
+		__flush_icache_all();
+		isb();
 	}
 }
 
@@ -178,7 +213,7 @@ static int __apply_alternatives_multi_stop(void *unused)
 		isb();
 	} else {
 		BUG_ON(alternatives_applied);
-		__apply_alternatives(&region, true);
+		__apply_alternatives(&region, false);
 		/* Barriers provided by the cache flushing */
 		WRITE_ONCE(alternatives_applied, 1);
 	}
@@ -192,12 +227,14 @@ void __init apply_alternatives_all(void)
 	stop_machine(__apply_alternatives_multi_stop, NULL, cpu_online_mask);
 }
 
-void apply_alternatives(void *start, size_t length)
+#ifdef CONFIG_MODULES
+void apply_alternatives_module(void *start, size_t length)
 {
 	struct alt_region region = {
 		.begin	= start,
 		.end	= start + length,
 	};
 
-	__apply_alternatives(&region, false);
+	__apply_alternatives(&region, true);
 }
+#endif
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index 155fd91e78f4..f0f27aeefb73 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -448,9 +448,8 @@ int module_finalize(const Elf_Ehdr *hdr,
 	const char *secstrs = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset;
 
 	for (s = sechdrs, se = sechdrs + hdr->e_shnum; s < se; s++) {
-		if (strcmp(".altinstructions", secstrs + s->sh_name) == 0) {
-			apply_alternatives((void *)s->sh_addr, s->sh_size);
-		}
+		if (strcmp(".altinstructions", secstrs + s->sh_name) == 0)
+			apply_alternatives_module((void *)s->sh_addr, s->sh_size);
 #ifdef CONFIG_ARM64_MODULE_PLTS
 		if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
 		    !strcmp(".text.ftrace_trampoline", secstrs + s->sh_name))
-- 
2.1.4

* [PATCH v2 2/4] arm64: Remove unnecessary ISBs from set_{pte,pmd,pud}
  2018-06-22  8:31 [PATCH v2 0/4] I-side fixes Will Deacon
  2018-06-22  8:31 ` [PATCH v2 1/4] arm64: Avoid flush_icache_range() in alternatives patching code Will Deacon
@ 2018-06-22  8:31 ` Will Deacon
  2018-06-22 10:46   ` Steve Capper
  2018-06-22  8:31 ` [PATCH v2 3/4] arm64: IPI each CPU after invalidating the I-cache for kernel mappings Will Deacon
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: Will Deacon @ 2018-06-22  8:31 UTC
  To: linux-arm-kernel

Commit 7f0b1bf04511 ("arm64: Fix barriers used for page table modifications")
fixed a reported issue with fixmap page-table entries not being visible
to the walker due to a missing DSB instruction. At the same time, it added
ISB instructions to the arm64 set_{pte,pmd,pud} functions, which are not
required by the architecture and make little sense in isolation.

Remove the redundant ISBs.

Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Steve Capper <steve.capper@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/pgtable.h | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 9f82d6b53851..1bdeca8918a6 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -224,10 +224,8 @@ static inline void set_pte(pte_t *ptep, pte_t pte)
 	 * Only if the new pte is valid and kernel, otherwise TLB maintenance
 	 * or update_mmu_cache() have the necessary barriers.
 	 */
-	if (pte_valid_not_user(pte)) {
+	if (pte_valid_not_user(pte))
 		dsb(ishst);
-		isb();
-	}
 }
 
 extern void __sync_icache_dcache(pte_t pteval);
@@ -434,7 +432,6 @@ static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
 {
 	WRITE_ONCE(*pmdp, pmd);
 	dsb(ishst);
-	isb();
 }
 
 static inline void pmd_clear(pmd_t *pmdp)
@@ -485,7 +482,6 @@ static inline void set_pud(pud_t *pudp, pud_t pud)
 {
 	WRITE_ONCE(*pudp, pud);
 	dsb(ishst);
-	isb();
 }
 
 static inline void pud_clear(pud_t *pudp)
-- 
2.1.4

* [PATCH v2 3/4] arm64: IPI each CPU after invalidating the I-cache for kernel mappings
  2018-06-22  8:31 [PATCH v2 0/4] I-side fixes Will Deacon
  2018-06-22  8:31 ` [PATCH v2 1/4] arm64: Avoid flush_icache_range() in alternatives patching code Will Deacon
  2018-06-22  8:31 ` [PATCH v2 2/4] arm64: Remove unnecessary ISBs from set_{pte,pmd,pud} Will Deacon
@ 2018-06-22  8:31 ` Will Deacon
  2018-06-28 16:00   ` Catalin Marinas
  2018-06-22  8:31 ` [PATCH v2 4/4] arm64: insn: Don't fallback on nosync path for general insn patching Will Deacon
  2018-06-28 16:01 ` [PATCH v2 0/4] I-side fixes Catalin Marinas
  4 siblings, 1 reply; 10+ messages in thread
From: Will Deacon @ 2018-06-22  8:31 UTC
  To: linux-arm-kernel

When invalidating the instruction cache for a kernel mapping via
flush_icache_range(), it is also necessary to flush the pipeline for
other CPUs so that instructions fetched into the pipeline before the
I-cache invalidation are discarded. For example, if module 'foo' is
unloaded and then module 'bar' is loaded into the same area of memory,
a CPU could end up executing instructions from 'foo' when branching into
'bar' if these instructions were fetched into the pipeline before 'foo'
was unloaded.

Whilst this is highly unlikely to occur in practice, particularly as
any exception acts as a context-synchronizing operation, following the
letter of the architecture requires us to execute an ISB on each CPU
in order for the new instruction stream to be visible.

Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/cacheflush.h | 13 ++++++++++++-
 arch/arm64/kernel/cpu_errata.c      |  2 +-
 arch/arm64/kernel/insn.c            | 18 ++++--------------
 arch/arm64/mm/cache.S               |  4 ++--
 4 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
index d264a7274811..a0ec27066e6f 100644
--- a/arch/arm64/include/asm/cacheflush.h
+++ b/arch/arm64/include/asm/cacheflush.h
@@ -71,7 +71,7 @@
  *		- kaddr  - page address
  *		- size   - region size
  */
-extern void flush_icache_range(unsigned long start, unsigned long end);
+extern void __flush_icache_range(unsigned long start, unsigned long end);
 extern int  invalidate_icache_range(unsigned long start, unsigned long end);
 extern void __flush_dcache_area(void *addr, size_t len);
 extern void __inval_dcache_area(void *addr, size_t len);
@@ -81,6 +81,17 @@ extern void __clean_dcache_area_pou(void *addr, size_t len);
 extern long __flush_cache_user_range(unsigned long start, unsigned long end);
 extern void sync_icache_aliases(void *kaddr, unsigned long len);
 
+static inline void flush_icache_range(unsigned long start, unsigned long end)
+{
+	__flush_icache_range(start, end);
+
+	/*
+	 * IPI all online CPUs so that they undergo a context synchronization
+	 * event and are forced to refetch the new instructions.
+	 */
+	kick_all_cpus_sync();
+}
+
 static inline void flush_cache_mm(struct mm_struct *mm)
 {
 }
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 1d2b6d768efe..0a338a1cd2d7 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -101,7 +101,7 @@ static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start,
 	for (i = 0; i < SZ_2K; i += 0x80)
 		memcpy(dst + i, hyp_vecs_start, hyp_vecs_end - hyp_vecs_start);
 
-	flush_icache_range((uintptr_t)dst, (uintptr_t)dst + SZ_2K);
+	__flush_icache_range((uintptr_t)dst, (uintptr_t)dst + SZ_2K);
 }
 
 static void __install_bp_hardening_cb(bp_hardening_cb_t fn,
diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
index 816d03c4c913..0f6a2e0cfde0 100644
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -216,8 +216,8 @@ int __kprobes aarch64_insn_patch_text_nosync(void *addr, u32 insn)
 
 	ret = aarch64_insn_write(tp, insn);
 	if (ret == 0)
-		flush_icache_range((uintptr_t)tp,
-				   (uintptr_t)tp + AARCH64_INSN_SIZE);
+		__flush_icache_range((uintptr_t)tp,
+				     (uintptr_t)tp + AARCH64_INSN_SIZE);
 
 	return ret;
 }
@@ -283,18 +283,8 @@ int __kprobes aarch64_insn_patch_text(void *addrs[], u32 insns[], int cnt)
 		if (ret)
 			return ret;
 
-		if (aarch64_insn_hotpatch_safe(insn, insns[0])) {
-			/*
-			 * ARMv8 architecture doesn't guarantee all CPUs see
-			 * the new instruction after returning from function
-			 * aarch64_insn_patch_text_nosync(). So send IPIs to
-			 * all other CPUs to achieve instruction
-			 * synchronization.
-			 */
-			ret = aarch64_insn_patch_text_nosync(addrs[0], insns[0]);
-			kick_all_cpus_sync();
-			return ret;
-		}
+		if (aarch64_insn_hotpatch_safe(insn, insns[0]))
+			return aarch64_insn_patch_text_nosync(addrs[0], insns[0]);
 	}
 
 	return aarch64_insn_patch_text_sync(addrs, insns, cnt);
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index 30334d81b021..0c22ede52f90 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -35,7 +35,7 @@
  *	- start   - virtual start address of region
  *	- end     - virtual end address of region
  */
-ENTRY(flush_icache_range)
+ENTRY(__flush_icache_range)
 	/* FALLTHROUGH */
 
 /*
@@ -77,7 +77,7 @@ alternative_else_nop_endif
 9:
 	mov	x0, #-EFAULT
 	b	1b
-ENDPROC(flush_icache_range)
+ENDPROC(__flush_icache_range)
 ENDPROC(__flush_cache_user_range)
 
 /*
-- 
2.1.4

* [PATCH v2 4/4] arm64: insn: Don't fallback on nosync path for general insn patching
  2018-06-22  8:31 [PATCH v2 0/4] I-side fixes Will Deacon
                   ` (2 preceding siblings ...)
  2018-06-22  8:31 ` [PATCH v2 3/4] arm64: IPI each CPU after invalidating the I-cache for kernel mappings Will Deacon
@ 2018-06-22  8:31 ` Will Deacon
  2018-06-28 16:01 ` [PATCH v2 0/4] I-side fixes Catalin Marinas
  4 siblings, 0 replies; 10+ messages in thread
From: Will Deacon @ 2018-06-22  8:31 UTC
  To: linux-arm-kernel

Patching kernel instructions at runtime requires other CPUs to undergo
a context synchronisation event via an explicit ISB or an IPI in order
to ensure that the new instructions are visible. This is required even
for "hotpatch" instructions such as NOP and BL, so avoid optimising in
this case and always go via stop_machine() when performing general
patching.

ftrace isn't quite as strict, so it can continue to call the nosync
code directly.

Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/insn.h |  2 --
 arch/arm64/kernel/insn.c      | 56 +------------------------------------------
 2 files changed, 1 insertion(+), 57 deletions(-)

diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index f62c56b1793f..c6802dea6cab 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -446,8 +446,6 @@ u32 aarch64_insn_gen_prefetch(enum aarch64_insn_register base,
 s32 aarch64_get_branch_offset(u32 insn);
 u32 aarch64_set_branch_offset(u32 insn, s32 offset);
 
-bool aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn);
-
 int aarch64_insn_patch_text_nosync(void *addr, u32 insn);
 int aarch64_insn_patch_text(void *addrs[], u32 insns[], int cnt);
 
diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
index 0f6a2e0cfde0..2b3413549734 100644
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -149,20 +149,6 @@ int __kprobes aarch64_insn_write(void *addr, u32 insn)
 	return __aarch64_insn_write(addr, cpu_to_le32(insn));
 }
 
-static bool __kprobes __aarch64_insn_hotpatch_safe(u32 insn)
-{
-	if (aarch64_get_insn_class(insn) != AARCH64_INSN_CLS_BR_SYS)
-		return false;
-
-	return	aarch64_insn_is_b(insn) ||
-		aarch64_insn_is_bl(insn) ||
-		aarch64_insn_is_svc(insn) ||
-		aarch64_insn_is_hvc(insn) ||
-		aarch64_insn_is_smc(insn) ||
-		aarch64_insn_is_brk(insn) ||
-		aarch64_insn_is_nop(insn);
-}
-
 bool __kprobes aarch64_insn_uses_literal(u32 insn)
 {
 	/* ldr/ldrsw (literal), prfm */
@@ -189,22 +175,6 @@ bool __kprobes aarch64_insn_is_branch(u32 insn)
 		aarch64_insn_is_bcond(insn);
 }
 
-/*
- * ARM Architecture Reference Manual for ARMv8 Profile-A, Issue A.a
- * Section B2.6.5 "Concurrent modification and execution of instructions":
- * Concurrent modification and execution of instructions can lead to the
- * resulting instruction performing any behavior that can be achieved by
- * executing any sequence of instructions that can be executed from the
- * same Exception level, except where the instruction before modification
- * and the instruction after modification is a B, BL, NOP, BKPT, SVC, HVC,
- * or SMC instruction.
- */
-bool __kprobes aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn)
-{
-	return __aarch64_insn_hotpatch_safe(old_insn) &&
-	       __aarch64_insn_hotpatch_safe(new_insn);
-}
-
 int __kprobes aarch64_insn_patch_text_nosync(void *addr, u32 insn)
 {
 	u32 *tp = addr;
@@ -239,11 +209,6 @@ static int __kprobes aarch64_insn_patch_text_cb(void *arg)
 		for (i = 0; ret == 0 && i < pp->insn_cnt; i++)
 			ret = aarch64_insn_patch_text_nosync(pp->text_addrs[i],
 							     pp->new_insns[i]);
-		/*
-		 * aarch64_insn_patch_text_nosync() calls flush_icache_range(),
-		 * which ends with "dsb; isb" pair guaranteeing global
-		 * visibility.
-		 */
 		/* Notify other processors with an additional increment. */
 		atomic_inc(&pp->cpu_count);
 	} else {
@@ -255,8 +220,7 @@ static int __kprobes aarch64_insn_patch_text_cb(void *arg)
 	return ret;
 }
 
-static
-int __kprobes aarch64_insn_patch_text_sync(void *addrs[], u32 insns[], int cnt)
+int __kprobes aarch64_insn_patch_text(void *addrs[], u32 insns[], int cnt)
 {
 	struct aarch64_insn_patch patch = {
 		.text_addrs = addrs,
@@ -272,24 +236,6 @@ int __kprobes aarch64_insn_patch_text_sync(void *addrs[], u32 insns[], int cnt)
 				       cpu_online_mask);
 }
 
-int __kprobes aarch64_insn_patch_text(void *addrs[], u32 insns[], int cnt)
-{
-	int ret;
-	u32 insn;
-
-	/* Unsafe to patch multiple instructions without synchronizaiton */
-	if (cnt == 1) {
-		ret = aarch64_insn_read(addrs[0], &insn);
-		if (ret)
-			return ret;
-
-		if (aarch64_insn_hotpatch_safe(insn, insns[0]))
-			return aarch64_insn_patch_text_nosync(addrs[0], insns[0]);
-	}
-
-	return aarch64_insn_patch_text_sync(addrs, insns, cnt);
-}
-
 static int __kprobes aarch64_get_imm_shift_mask(enum aarch64_insn_imm_type type,
 						u32 *maskp, int *shiftp)
 {
-- 
2.1.4

* [PATCH v2 2/4] arm64: Remove unnecessary ISBs from set_{pte,pmd,pud}
  2018-06-22  8:31 ` [PATCH v2 2/4] arm64: Remove unnecessary ISBs from set_{pte,pmd,pud} Will Deacon
@ 2018-06-22 10:46   ` Steve Capper
  0 siblings, 0 replies; 10+ messages in thread
From: Steve Capper @ 2018-06-22 10:46 UTC
  To: linux-arm-kernel

[Apologies for getting to this late]

Hi Will,

On Fri, Jun 22, 2018 at 09:31:16AM +0100, Will Deacon wrote:
> Commit 7f0b1bf04511 ("arm64: Fix barriers used for page table modifications")
> fixed a reported issue with fixmap page-table entries not being visible
> to the walker due to a missing DSB instruction. At the same time, it added
> ISB instructions to the arm64 set_{pte,pmd,pud} functions, which are not
> required by the architecture and make little sense in isolation.
> 
> Remove the redundant ISBs.
> 

I agree: we don't want to update any cached local execution context
information whilst setting page table entries, and TLB maintenance is
handled elsewhere, so the ISBs are redundant.

FWIW: Acked-by: Steve Capper <steve.capper@linaro.org>

Cheers,
-- 
Steve

* [PATCH v2 3/4] arm64: IPI each CPU after invalidating the I-cache for kernel mappings
  2018-06-22  8:31 ` [PATCH v2 3/4] arm64: IPI each CPU after invalidating the I-cache for kernel mappings Will Deacon
@ 2018-06-28 16:00   ` Catalin Marinas
  2018-06-28 16:46     ` Will Deacon
  0 siblings, 1 reply; 10+ messages in thread
From: Catalin Marinas @ 2018-06-28 16:00 UTC
  To: linux-arm-kernel

Hi Will,

On Fri, Jun 22, 2018 at 09:31:17AM +0100, Will Deacon wrote:
> When invalidating the instruction cache for a kernel mapping via
> flush_icache_range(), it is also necessary to flush the pipeline for
> other CPUs so that instructions fetched into the pipeline before the
> I-cache invalidation are discarded. For example, if module 'foo' is
> unloaded and then module 'bar' is loaded into the same area of memory,
> a CPU could end up executing instructions from 'foo' when branching into
> 'bar' if these instructions were fetched into the pipeline before 'foo'
> was unloaded.
> 
> Whilst this is highly unlikely to occur in practice, particularly as
> any exception acts as a context-synchronizing operation, following the
> letter of the architecture requires us to execute an ISB on each CPU
> in order for the new instruction stream to be visible.
> 
> Signed-off-by: Will Deacon <will.deacon@arm.com>

I hit a warning via kgdb_flush_swbreak_addr() because it runs with IRQs disabled
(and it actually deadlocks for me; running as a guest under KVM on TX1):

WARNING: CPU: 3 PID: 1 at /home/cmarinas/work/Linux/linux-2.6-aarch64/kernel/smp.c:416 smp_call_function_many+0xd4/0x350
Modules linked in:
CPU: 3 PID: 1 Comm: swapper/0 Not tainted 4.18.0-rc2-00006-ga5cfc8429d36 #97
Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
pstate: 200003c5 (nzCv DAIF -PAN -UAO)
pc : smp_call_function_many+0xd4/0x350
lr : smp_call_function+0x38/0x68
sp : ffff0000080737f0
x29: ffff0000080737f0 x28: ffff000009169000 
x27: 0000000000000003 x26: 0000000000000000 
x25: ffff00000815a3d0 x24: 0000000000000001 
x23: 0000000000000000 x22: ffff000008eba730 
x21: ffff000009169940 x20: 0000000000000000 
x19: 0000000000000003 x18: ffffffffffffffff 
x17: 0000000000000001 x16: 0000000000000019 
x15: ffff0000091696c8 x14: ffff00008930ef07 
x13: ffff00000930ef1d x12: 0000000000000010 
x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f 
x9 : 656565652aff6223 x8 : ffffffffffffffff 
x7 : fefefefefefefefe x6 : ffff7dfffe7fe014 
x5 : ffff000009169940 x4 : ffff000009147018 
x3 : 0000000000000001 x2 : 0000000000000000 
x1 : ffff00000815a3d0 x0 : 0000000000000000 
Call trace:
 smp_call_function_many+0xd4/0x350
 smp_call_function+0x38/0x68
 kick_all_cpus_sync+0x20/0x28
 kgdb_flush_swbreak_addr+0x14/0x20
 dbg_activate_sw_breakpoints+0x74/0xb0
 gdb_serial_stub+0x6ec/0xc80
 kgdb_cpu_enter+0x36c/0x578
 kgdb_handle_exception+0x130/0x238
 kgdb_compiled_brk_fn+0x28/0x38
 brk_handler+0x80/0xd8
 do_debug_exception+0x94/0x170
 el1_dbg+0x18/0x78
 kgdb_breakpoint+0x20/0x40
 run_breakpoint_test+0x58/0xb0
 configure_kgdbts+0x1b8/0x538
 init_kgdbts+0x1c/0x2c
 do_one_initcall+0x58/0x170
 kernel_init_freeable+0x184/0x22c
 kernel_init+0x10/0x108
 ret_from_fork+0x10/0x18

-- 
Catalin

* [PATCH v2 0/4] I-side fixes
  2018-06-22  8:31 [PATCH v2 0/4] I-side fixes Will Deacon
                   ` (3 preceding siblings ...)
  2018-06-22  8:31 ` [PATCH v2 4/4] arm64: insn: Don't fallback on nosync path for general insn patching Will Deacon
@ 2018-06-28 16:01 ` Catalin Marinas
  4 siblings, 0 replies; 10+ messages in thread
From: Catalin Marinas @ 2018-06-28 16:01 UTC
  To: linux-arm-kernel

On Fri, Jun 22, 2018 at 09:31:14AM +0100, Will Deacon wrote:
> Will Deacon (4):
>   arm64: Avoid flush_icache_range() in alternatives patching code
>   arm64: Remove unnecessary ISBs from set_{pte,pmd,pud}

I'm merging these two patches for -rc3. I'll leave the other two to you
for 4.19.

-- 
Catalin

* [PATCH v2 3/4] arm64: IPI each CPU after invalidating the I-cache for kernel mappings
  2018-06-28 16:00   ` Catalin Marinas
@ 2018-06-28 16:46     ` Will Deacon
  2018-06-29 14:52       ` Catalin Marinas
  0 siblings, 1 reply; 10+ messages in thread
From: Will Deacon @ 2018-06-28 16:46 UTC
  To: linux-arm-kernel

Hi Catalin,

On Thu, Jun 28, 2018 at 05:00:05PM +0100, Catalin Marinas wrote:
> On Fri, Jun 22, 2018 at 09:31:17AM +0100, Will Deacon wrote:
> > When invalidating the instruction cache for a kernel mapping via
> > flush_icache_range(), it is also necessary to flush the pipeline for
> > other CPUs so that instructions fetched into the pipeline before the
> > I-cache invalidation are discarded. For example, if module 'foo' is
> > unloaded and then module 'bar' is loaded into the same area of memory,
> > a CPU could end up executing instructions from 'foo' when branching into
> > 'bar' if these instructions were fetched into the pipeline before 'foo'
> > was unloaded.
> > 
> > Whilst this is highly unlikely to occur in practice, particularly as
> > any exception acts as a context-synchronizing operation, following the
> > letter of the architecture requires us to execute an ISB on each CPU
> > in order for the new instruction stream to be visible.
> > 
> > Signed-off-by: Will Deacon <will.deacon@arm.com>
> 
> I hit a warning via kgdb_flush_swbreak_addr() because it runs with IRQs disabled
> (and it actually deadlocks for me; running as a guest under KVM on TX1):
> 
> WARNING: CPU: 3 PID: 1 at /home/cmarinas/work/Linux/linux-2.6-aarch64/kernel/smp.c:416 smp_call_function_many+0xd4/0x350
> Modules linked in:
> CPU: 3 PID: 1 Comm: swapper/0 Not tainted 4.18.0-rc2-00006-ga5cfc8429d36 #97
> Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
> pstate: 200003c5 (nzCv DAIF -PAN -UAO)
> pc : smp_call_function_many+0xd4/0x350
> lr : smp_call_function+0x38/0x68
> sp : ffff0000080737f0
> x29: ffff0000080737f0 x28: ffff000009169000 
> x27: 0000000000000003 x26: 0000000000000000 
> x25: ffff00000815a3d0 x24: 0000000000000001 
> x23: 0000000000000000 x22: ffff000008eba730 
> x21: ffff000009169940 x20: 0000000000000000 
> x19: 0000000000000003 x18: ffffffffffffffff 
> x17: 0000000000000001 x16: 0000000000000019 
> x15: ffff0000091696c8 x14: ffff00008930ef07 
> x13: ffff00000930ef1d x12: 0000000000000010 
> x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f 
> x9 : 656565652aff6223 x8 : ffffffffffffffff 
> x7 : fefefefefefefefe x6 : ffff7dfffe7fe014 
> x5 : ffff000009169940 x4 : ffff000009147018 
> x3 : 0000000000000001 x2 : 0000000000000000 
> x1 : ffff00000815a3d0 x0 : 0000000000000000 
> Call trace:
>  smp_call_function_many+0xd4/0x350
>  smp_call_function+0x38/0x68
>  kick_all_cpus_sync+0x20/0x28
>  kgdb_flush_swbreak_addr+0x14/0x20
>  dbg_activate_sw_breakpoints+0x74/0xb0
>  gdb_serial_stub+0x6ec/0xc80
>  kgdb_cpu_enter+0x36c/0x578

Yuck, this is horrible. We don't actually need the IPI in this case because
kgdb is using smp_call_function to round up the secondary cores, but the core
code doesn't have the flexibility for us to hook the operation like that.
All it provides is a CACHE_FLUSH_IS_SAFE #define, which you can set to 0 if
the maintenance isn't safe in IRQ-disabled context. However, there isn't a
fall-back and the maintenance is just not performed at all in that case.

If we had a "flush the entire D-cache to PoU" instruction, we could hack
something into the backend (MIPS does this) but we don't. The best I can
come up with is the nasty hack below.

Will

--->8

diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
index a0ec27066e6f..5a15d3ce3f0e 100644
--- a/arch/arm64/include/asm/cacheflush.h
+++ b/arch/arm64/include/asm/cacheflush.h
@@ -89,6 +89,19 @@ static inline void flush_icache_range(unsigned long start, unsigned long end)
 	 * IPI all online CPUs so that they undergo a context synchronization
 	 * event and are forced to refetch the new instructions.
 	 */
+#ifdef CONFIG_KGDB
+	/*
+	 * KGDB performs cache maintenance with interrupts disabled, so we
+	 * will deadlock trying to IPI the secondary CPUs. In theory, we can
+	 * set CACHE_FLUSH_IS_SAFE to 0 to avoid this known issue, but that
+	 * just means that KGDB will elide the maintenance altogether! As it
+	 * turns out, KGDB uses IPIs to round-up the secondary CPUs during
+	 * the patching operation, so we don't need extra IPIs here anyway.
+	 * In which case, add a KGDB-specific bodge and return early.
+	 */
+	if (kgdb_connected && irqs_disabled())
+		return;
+#endif
 	kick_all_cpus_sync();
 }
 

* [PATCH v2 3/4] arm64: IPI each CPU after invalidating the I-cache for kernel mappings
  2018-06-28 16:46     ` Will Deacon
@ 2018-06-29 14:52       ` Catalin Marinas
  0 siblings, 0 replies; 10+ messages in thread
From: Catalin Marinas @ 2018-06-29 14:52 UTC
  To: linux-arm-kernel

On Thu, Jun 28, 2018 at 05:46:47PM +0100, Will Deacon wrote:
> On Thu, Jun 28, 2018 at 05:00:05PM +0100, Catalin Marinas wrote:
> > On Fri, Jun 22, 2018 at 09:31:17AM +0100, Will Deacon wrote:
> > > When invalidating the instruction cache for a kernel mapping via
> > > flush_icache_range(), it is also necessary to flush the pipeline for
> > > other CPUs so that instructions fetched into the pipeline before the
> > > I-cache invalidation are discarded. For example, if module 'foo' is
> > > unloaded and then module 'bar' is loaded into the same area of memory,
> > > a CPU could end up executing instructions from 'foo' when branching into
> > > 'bar' if these instructions were fetched into the pipeline before 'foo'
> > > was unloaded.
> > > 
> > > Whilst this is highly unlikely to occur in practice, particularly as
> > > any exception acts as a context-synchronizing operation, following the
> > > letter of the architecture requires us to execute an ISB on each CPU
> > > in order for the new instruction stream to be visible.
> > > 
> > > Signed-off-by: Will Deacon <will.deacon@arm.com>
> > 
> > I hit a warning via kgdb_flush_swbreak_addr() because it runs with IRQs disabled
> > (and it actually deadlocks for me; running as a guest under KVM on TX1):
[...]
> Yuck, this is horrible. We don't actually need the IPI in this case because
> kgdb is using smp_call_function to roundup the secondary cores, but the core
> code doesn't have the flexibility for us to hook the operation like that.
> All it provides is a CACHE_FLUSH_IS_SAFE #define, which you can set to 0 if
> the maintenance isn't safe in IRQ-disabled context. However, there isn't a
> fall-back and the maintenance is just not performed at all in that case.
> 
> If we had a "flush the entire D-cache to PoU" instruction, we could hack
> something into the backend (MIPS does this) but we don't. The best I can
> come up with is the nasty hack below.
> 
> Will
> 
> --->8
> 
> diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
> index a0ec27066e6f..5a15d3ce3f0e 100644
> --- a/arch/arm64/include/asm/cacheflush.h
> +++ b/arch/arm64/include/asm/cacheflush.h
> @@ -89,6 +89,19 @@ static inline void flush_icache_range(unsigned long start, unsigned long end)
>  	 * IPI all online CPUs so that they undergo a context synchronization
>  	 * event and are forced to refetch the new instructions.
>  	 */
> +#ifdef CONFIG_KGDB
> +	/*
> +	 * KGDB performs cache maintenance with interrupts disabled, so we
> +	 * will deadlock trying to IPI the secondary CPUs. In theory, we can
> +	 * set CACHE_FLUSH_IS_SAFE to 0 to avoid this known issue, but that
> +	 * just means that KGDB will elide the maintenance altogether! As it
> +	 * turns out, KGDB uses IPIs to round-up the secondary CPUs during
> +	 * the patching operation, so we don't need extra IPIs here anyway.
> +	 * In which case, add a KGDB-specific bodge and return early.
> +	 */
> +	if (kgdb_connected && irqs_disabled())
> +		return;
> +#endif
>  	kick_all_cpus_sync();
>  }

It's indeed a hack but we can live with this. On the patch with this
hunk:

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
