From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 9DBEDC27C5E for ; Tue, 11 Jun 2024 14:29:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:In-Reply-To:MIME-Version:References: Message-ID:Subject:Cc:To:From:Date:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=ZqCQSuTMJDMzUkec3BwZaaSIp9Vqp6WvMZszbYfb0Uo=; b=unCnzlS0amo2qf 8QgZS8ZCPIiw6c2LDN5TjnEDTES+8iJludflH5MFWu/m0bq94PcieRznfgU1QxlvNKHXK6uCFHgwq HK0wN1GsVwRKo7f32DWd49vIlr53QivUfwAJ5hisKolnKDlU0ZiwqSsRl94qAiMxiETcjHr7jiksJ QA7oX+HBmKvLqT5Z+jJZYTxP0NYA63c2kEWawEp5rx1s0TClyEH0YQGdHMEXH2CKDweHa7Mk+GDnC yld8leVai5exORpXaTOQD3Lt7ylD1ei+Ewdyayixc6pdr2n8Rj0UNcTYW6rwrX+44Ov/n2sPywD9u bf3DD2QDb7SPKgzJzFgA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1sH2V3-000000099Gg-2G3q; Tue, 11 Jun 2024 14:29:25 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1sH2V1-000000099Ff-03sD for linux-arm-kernel@lists.infradead.org; Tue, 11 Jun 2024 14:29:24 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 58A28152B; Tue, 11 Jun 2024 07:29:42 -0700 (PDT) Received: from J2N7QTR9R3 (usa-sjc-imap-foss1.foss.arm.com [10.121.207.14]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 952273F5A1; Tue, 11 Jun 2024 07:29:15 -0700 (PDT) Date: Tue, 11 Jun 2024 15:29:09 +0100 From: Mark Rutland To: Linus Torvalds Cc: Peter Anvin , Ingo Molnar , Borislav Petkov , Thomas Gleixner , Rasmus Villemoes , Josh Poimboeuf , Catalin Marinas , Will Deacon , Linux Kernel Mailing List , the arch/x86 maintainers , linux-arm-kernel@lists.infradead.org, linux-arch Subject: Re: [PATCH 4/7] arm64: add 'runtime constant' support Message-ID: References: <20240610204821.230388-1-torvalds@linux-foundation.org> <20240610204821.230388-5-torvalds@linux-foundation.org> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20240610204821.230388-5-torvalds@linux-foundation.org> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240611_072923_199836_8922E8DF X-CRM114-Status: GOOD ( 28.58 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Hi Linus, On Mon, Jun 10, 2024 at 01:48:18PM -0700, Linus Torvalds wrote: > This implements the runtime constant infrastructure for arm64, allowing > the dcache d_hash() function to be generated using as a constant for > hash table address followed by shift by a constant of the hash index. > > Signed-off-by: Linus Torvalds > --- > arch/arm64/include/asm/runtime-const.h | 75 ++++++++++++++++++++++++++ > arch/arm64/kernel/vmlinux.lds.S | 3 ++ > 2 files changed, 78 insertions(+) > create mode 100644 arch/arm64/include/asm/runtime-const.h >From a quick scan I spotted a couple of issues (explained with fixes below). I haven't had the chance to test/review the whole series yet. Do we expect to use this more widely? If this only really matters for d_hash() it might be better to handle this via the alternatives framework with callbacks and avoid the need for new infrastructure. > diff --git a/arch/arm64/include/asm/runtime-const.h b/arch/arm64/include/asm/runtime-const.h > new file mode 100644 > index 000000000000..02462b2cb6f9 > --- /dev/null > +++ b/arch/arm64/include/asm/runtime-const.h > @@ -0,0 +1,75 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +#ifndef _ASM_RUNTIME_CONST_H > +#define _ASM_RUNTIME_CONST_H > + > +#define runtime_const_ptr(sym) ({ \ > + typeof(sym) __ret; \ > + asm_inline("1:\t" \ > + "movz %0, #0xcdef\n\t" \ > + "movk %0, #0x89ab, lsl #16\n\t" \ > + "movk %0, #0x4567, lsl #32\n\t" \ > + "movk %0, #0x0123, lsl #48\n\t" \ > + ".pushsection runtime_ptr_" #sym ",\"a\"\n\t" \ > + ".long 1b - .\n\t" \ > + ".popsection" \ > + :"=r" (__ret)); \ > + __ret; }) > + > +#define runtime_const_shift_right_32(val, sym) ({ \ > + unsigned long __ret; \ > + asm_inline("1:\t" \ > + "lsr %w0,%w1,#12\n\t" \ > + ".pushsection runtime_shift_" #sym ",\"a\"\n\t" \ > + ".long 1b - .\n\t" \ > + ".popsection" \ > + :"=r" (__ret) \ > + :"r" (0u+(val))); \ > + __ret; }) > + > +#define runtime_const_init(type, sym) do { \ > + extern s32 __start_runtime_##type##_##sym[]; \ > + extern s32 __stop_runtime_##type##_##sym[]; \ > + runtime_const_fixup(__runtime_fixup_##type, \ > + (unsigned long)(sym), \ > + __start_runtime_##type##_##sym, \ > + __stop_runtime_##type##_##sym); \ > +} while (0) > + > +// 16-bit immediate for wide move (movz and movk) in bits 5..20 > +static inline void __runtime_fixup_16(unsigned int *p, unsigned int val) > +{ > + unsigned int insn = *p; > + insn &= 0xffe0001f; > + insn |= (val & 0xffff) << 5; > + *p = insn; > +} As-is this will break BE kernels as instructions are always encoded little-endian regardless of data endianness. We usually handle that by using __le32 instruction pointers and using le32_to_cpu()/cpu_to_le32() when reading/writing, e.g. #include static inline void __runtime_fixup_16(__le32 *p, unsigned int val) { u32 insn = le32_to_cpu(*p); insn &= 0xffe0001f; insn |= (val & 0xffff) << 5; *p = cpu_to_le32(insn); } We have some helpers for instruction manipulation, and we can use aarch64_insn_encode_immediate() here, e.g. #include static inline void __runtime_fixup_16(__le32 *p, unsigned int val) { u32 insn = le32_to_cpu(*p); insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_16, insn, val); *p = cpu_to_le32(insn); } > +static inline void __runtime_fixup_ptr(void *where, unsigned long val) > +{ > + unsigned int *p = lm_alias(where); > + __runtime_fixup_16(p, val); > + __runtime_fixup_16(p+1, val >> 16); > + __runtime_fixup_16(p+2, val >> 32); > + __runtime_fixup_16(p+3, val >> 48); > +} This is missing the necessary cache maintenance and context synchronization event. After the new values are written, we need cache maintenance (a D$ clean to PoU, then an I$ invalidate to PoU) followed by a context synchronization event (e.g. an ISB) before CPUs are guaranteed to use the new instructions rather than the old ones. Depending on how your system has been integrated, you might get away without that. If you see: Data cache clean to the PoU not required for I/D coherence ... in your dmesg, that means you only need the I$ invalidate and context synchronization event, and you might happen to get those by virtue of alternative patching running between dcache_init_early() and the use of the patched instructions. However, in general, we do need all of that. As long as this runs before secondaries are brought up, we can handle that with caches_clean_inval_pou(). Assuming the __le32 changes above, I'd expect this to be: static inline void __runtime_fixup_ptr(void *where, unsigned long val) { __le32 *p = lm_alias(where); __runtime_fixup_16(p, val); __runtime_fixup_16(p + 1, val >> 16); __runtime_fixup_16(p + 2, val >> 32); __runtime_fixup_16(p + 3, val >> 48); caches_clean_inval_pou((unsigned long)p, (unsigned long)(p + 4)); } > +// Immediate value is 5 bits starting at bit #16 > +static inline void __runtime_fixup_shift(void *where, unsigned long val) > +{ > + unsigned int *p = lm_alias(where); > + unsigned int insn = *p; > + insn &= 0xffc0ffff; > + insn |= (val & 63) << 16; > + *p = insn; > +} As with the other bits above, I'd expect this to be: static inline void __runtime_fixup_shift(void *where, unsigned long val) { __le32 *p = lm_alias(where); u32 insn = le32_to_cpu(*p); insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_R, insn, val); *p = cpu_to_le32(insn); caches_clean_inval_pou((unsigned long)p, (unsigned long)(p + 1)); } Mark. > + > +static inline void runtime_const_fixup(void (*fn)(void *, unsigned long), > + unsigned long val, s32 *start, s32 *end) > +{ > + while (start < end) { > + fn(*start + (void *)start, val); > + start++; > + } > +} > + > +#endif > diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S > index 755a22d4f840..55a8e310ea12 100644 > --- a/arch/arm64/kernel/vmlinux.lds.S > +++ b/arch/arm64/kernel/vmlinux.lds.S > @@ -264,6 +264,9 @@ SECTIONS > EXIT_DATA > } > > + RUNTIME_CONST(shift, d_hash_shift) > + RUNTIME_CONST(ptr, dentry_hashtable) > + > PERCPU_SECTION(L1_CACHE_BYTES) > HYPERVISOR_PERCPU_SECTION > > -- > 2.45.1.209.gc6f12300df > > _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel