linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] Runtime constants series
@ 2024-07-09 15:52 Linus Torvalds
  2024-07-09 15:52 ` [PATCH 1/4] vfs: dcache: move hashlen_hash() from callers into d_hash() Linus Torvalds
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Linus Torvalds @ 2024-07-09 15:52 UTC (permalink / raw)
  To: Mark Rutland; +Cc: Linux ARM

This is also available in the (possibly still to be rebased) git tree at

  git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git runtime-constants

for people who just prefer to do a 'git fetch' and look at things
locally. 

Four patches to make the dentry cache lookup use the new runtime
constants feature rather than load the hash table address and size
dynamically:

 vfs: dcache: move hashlen_hash() from callers into d_hash()
 runtime constants: add default dummy infrastructure
 runtime constants: add x86 architecture support
 arm64: add 'runtime constant' support

with the diffstat being

 arch/arm64/include/asm/runtime-const.h | 88 ++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/vmlinux.lds.S        |  3 ++
 arch/x86/include/asm/runtime-const.h   | 61 +++++++++++++++++++++++
 arch/x86/kernel/vmlinux.lds.S          |  3 ++
 fs/dcache.c                            | 17 +++++--
 include/asm-generic/Kbuild             |  1 +
 include/asm-generic/runtime-const.h    | 15 ++++++
 include/asm-generic/vmlinux.lds.h      |  8 ++++
 8 files changed, 192 insertions(+), 4 deletions(-)
 create mode 100644 arch/arm64/include/asm/runtime-const.h
 create mode 100644 arch/x86/include/asm/runtime-const.h
 create mode 100644 include/asm-generic/runtime-const.h

Thanks,

           Linus


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/4] vfs: dcache: move hashlen_hash() from callers into d_hash()
  2024-07-09 15:52 [PATCH 0/4] Runtime constants series Linus Torvalds
@ 2024-07-09 15:52 ` Linus Torvalds
  2024-07-09 15:52 ` [PATCH 2/4] runtime constants: add default dummy infrastructure Linus Torvalds
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Linus Torvalds @ 2024-07-09 15:52 UTC (permalink / raw)
  To: Mark Rutland; +Cc: Linux ARM, Linus Torvalds

Both __d_lookup_rcu() and __d_lookup_rcu_op_compare() have the full
'name_hash' value of the qstr that they want to look up, and mask it off
to just the low 32-bit hash before calling down to d_hash().

Other callers just load the 32-bit hash and pass it as the argument.

If we move the masking into d_hash() itself, it simplifies the two
callers that currently do the masking, and is a no-op for the other
cases.  It doesn't actually change the generated code since the compiler
will inline d_hash() and see that the end result is the same.

[ Technically, since the parse tree changes, the code generation may not
  be 100% the same, and for me on x86-64, this does result in gcc
  switching the operands around for one 'cmpl' instruction. So not
  necessarily the exact same code generation, but equivalent ]

However, this does encapsulate the 'd_hash()' operation more, and makes
the shift operation in particular be a "shift 32 bits right, return full
word".  Which matches the instruction semantics on both x86-64 and arm64
better, since a 32-bit shift will clear the upper bits.

That makes the next step of introducing a "shift by runtime constant"
more obvious and generates the shift with no extraneous type masking.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 fs/dcache.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index 407095188f83..8b4382f5c99d 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -100,9 +100,9 @@ static unsigned int d_hash_shift __ro_after_init;
 
 static struct hlist_bl_head *dentry_hashtable __ro_after_init;
 
-static inline struct hlist_bl_head *d_hash(unsigned int hash)
+static inline struct hlist_bl_head *d_hash(unsigned long hashlen)
 {
-	return dentry_hashtable + (hash >> d_hash_shift);
+	return dentry_hashtable + ((u32)hashlen >> d_hash_shift);
 }
 
 #define IN_LOOKUP_SHIFT 10
@@ -2104,7 +2104,7 @@ static noinline struct dentry *__d_lookup_rcu_op_compare(
 	unsigned *seqp)
 {
 	u64 hashlen = name->hash_len;
-	struct hlist_bl_head *b = d_hash(hashlen_hash(hashlen));
+	struct hlist_bl_head *b = d_hash(hashlen);
 	struct hlist_bl_node *node;
 	struct dentry *dentry;
 
@@ -2171,7 +2171,7 @@ struct dentry *__d_lookup_rcu(const struct dentry *parent,
 {
 	u64 hashlen = name->hash_len;
 	const unsigned char *str = name->name;
-	struct hlist_bl_head *b = d_hash(hashlen_hash(hashlen));
+	struct hlist_bl_head *b = d_hash(hashlen);
 	struct hlist_bl_node *node;
 	struct dentry *dentry;
 
-- 
2.45.1.209.gc6f12300df



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/4] runtime constants: add default dummy infrastructure
  2024-07-09 15:52 [PATCH 0/4] Runtime constants series Linus Torvalds
  2024-07-09 15:52 ` [PATCH 1/4] vfs: dcache: move hashlen_hash() from callers into d_hash() Linus Torvalds
@ 2024-07-09 15:52 ` Linus Torvalds
  2024-07-09 15:52 ` [PATCH 3/4] runtime constants: add x86 architecture support Linus Torvalds
  2024-07-09 15:52 ` [PATCH 4/4] arm64: add 'runtime constant' support Linus Torvalds
  3 siblings, 0 replies; 5+ messages in thread
From: Linus Torvalds @ 2024-07-09 15:52 UTC (permalink / raw)
  To: Mark Rutland; +Cc: Linux ARM, Linus Torvalds

This adds the initial dummy support for 'runtime constants' for when
an architecture doesn't actually support an implementation of fixing
up said runtime constants.

This ends up being the fallback to just using the variables as regular
__ro_after_init variables, and changes the dcache d_hash() function to
use this model.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 fs/dcache.c                         | 11 ++++++++++-
 include/asm-generic/Kbuild          |  1 +
 include/asm-generic/runtime-const.h | 15 +++++++++++++++
 include/asm-generic/vmlinux.lds.h   |  8 ++++++++
 4 files changed, 34 insertions(+), 1 deletion(-)
 create mode 100644 include/asm-generic/runtime-const.h

diff --git a/fs/dcache.c b/fs/dcache.c
index 8b4382f5c99d..5d3a5b315692 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -35,6 +35,8 @@
 #include "internal.h"
 #include "mount.h"
 
+#include <asm/runtime-const.h>
+
 /*
  * Usage:
  * dcache->d_inode->i_lock protects:
@@ -102,7 +104,8 @@ static struct hlist_bl_head *dentry_hashtable __ro_after_init;
 
 static inline struct hlist_bl_head *d_hash(unsigned long hashlen)
 {
-	return dentry_hashtable + ((u32)hashlen >> d_hash_shift);
+	return runtime_const_ptr(dentry_hashtable) +
+		runtime_const_shift_right_32(hashlen, d_hash_shift);
 }
 
 #define IN_LOOKUP_SHIFT 10
@@ -3129,6 +3132,9 @@ static void __init dcache_init_early(void)
 					0,
 					0);
 	d_hash_shift = 32 - d_hash_shift;
+
+	runtime_const_init(shift, d_hash_shift);
+	runtime_const_init(ptr, dentry_hashtable);
 }
 
 static void __init dcache_init(void)
@@ -3157,6 +3163,9 @@ static void __init dcache_init(void)
 					0,
 					0);
 	d_hash_shift = 32 - d_hash_shift;
+
+	runtime_const_init(shift, d_hash_shift);
+	runtime_const_init(ptr, dentry_hashtable);
 }
 
 /* SLAB cache for __getname() consumers */
diff --git a/include/asm-generic/Kbuild b/include/asm-generic/Kbuild
index b20fa25a7e8d..052e5c98c105 100644
--- a/include/asm-generic/Kbuild
+++ b/include/asm-generic/Kbuild
@@ -46,6 +46,7 @@ mandatory-y += pci.h
 mandatory-y += percpu.h
 mandatory-y += pgalloc.h
 mandatory-y += preempt.h
+mandatory-y += runtime-const.h
 mandatory-y += rwonce.h
 mandatory-y += sections.h
 mandatory-y += serial.h
diff --git a/include/asm-generic/runtime-const.h b/include/asm-generic/runtime-const.h
new file mode 100644
index 000000000000..670499459514
--- /dev/null
+++ b/include/asm-generic/runtime-const.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_RUNTIME_CONST_H
+#define _ASM_RUNTIME_CONST_H
+
+/*
+ * This is the fallback for when the architecture doesn't
+ * support the runtime const operations.
+ *
+ * We just use the actual symbols as-is.
+ */
+#define runtime_const_ptr(sym) (sym)
+#define runtime_const_shift_right_32(val, sym) ((u32)(val)>>(sym))
+#define runtime_const_init(type,sym) do { } while (0)
+
+#endif
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 5703526d6ebf..389a78415b9b 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -944,6 +944,14 @@
 #define CON_INITCALL							\
 	BOUNDED_SECTION_POST_LABEL(.con_initcall.init, __con_initcall, _start, _end)
 
+#define RUNTIME_NAME(t,x) runtime_##t##_##x
+
+#define RUNTIME_CONST(t,x)						\
+	. = ALIGN(8);							\
+	RUNTIME_NAME(t,x) : AT(ADDR(RUNTIME_NAME(t,x)) - LOAD_OFFSET) {	\
+		*(RUNTIME_NAME(t,x));					\
+	}
+
 /* Alignment must be consistent with (kunit_suite *) in include/kunit/test.h */
 #define KUNIT_TABLE()							\
 		. = ALIGN(8);						\
-- 
2.45.1.209.gc6f12300df



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 3/4] runtime constants: add x86 architecture support
  2024-07-09 15:52 [PATCH 0/4] Runtime constants series Linus Torvalds
  2024-07-09 15:52 ` [PATCH 1/4] vfs: dcache: move hashlen_hash() from callers into d_hash() Linus Torvalds
  2024-07-09 15:52 ` [PATCH 2/4] runtime constants: add default dummy infrastructure Linus Torvalds
@ 2024-07-09 15:52 ` Linus Torvalds
  2024-07-09 15:52 ` [PATCH 4/4] arm64: add 'runtime constant' support Linus Torvalds
  3 siblings, 0 replies; 5+ messages in thread
From: Linus Torvalds @ 2024-07-09 15:52 UTC (permalink / raw)
  To: Mark Rutland; +Cc: Linux ARM, Linus Torvalds

This implements the runtime constant infrastructure for x86, allowing
the dcache d_hash() function to be generated using as a constant for
hash table address followed by shift by a constant of the hash index.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 arch/x86/include/asm/runtime-const.h | 61 ++++++++++++++++++++++++++++
 arch/x86/kernel/vmlinux.lds.S        |  3 ++
 2 files changed, 64 insertions(+)
 create mode 100644 arch/x86/include/asm/runtime-const.h

diff --git a/arch/x86/include/asm/runtime-const.h b/arch/x86/include/asm/runtime-const.h
new file mode 100644
index 000000000000..24e3a53ca255
--- /dev/null
+++ b/arch/x86/include/asm/runtime-const.h
@@ -0,0 +1,61 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_RUNTIME_CONST_H
+#define _ASM_RUNTIME_CONST_H
+
+#define runtime_const_ptr(sym) ({				\
+	typeof(sym) __ret;					\
+	asm_inline("mov %1,%0\n1:\n"				\
+		".pushsection runtime_ptr_" #sym ",\"a\"\n\t"	\
+		".long 1b - %c2 - .\n\t"			\
+		".popsection"					\
+		:"=r" (__ret)					\
+		:"i" ((unsigned long)0x0123456789abcdefull),	\
+		 "i" (sizeof(long)));				\
+	__ret; })
+
+// The 'typeof' will create at _least_ a 32-bit type, but
+// will happily also take a bigger type and the 'shrl' will
+// clear the upper bits
+#define runtime_const_shift_right_32(val, sym) ({		\
+	typeof(0u+(val)) __ret = (val);				\
+	asm_inline("shrl $12,%k0\n1:\n"				\
+		".pushsection runtime_shift_" #sym ",\"a\"\n\t"	\
+		".long 1b - 1 - .\n\t"				\
+		".popsection"					\
+		:"+r" (__ret));					\
+	__ret; })
+
+#define runtime_const_init(type, sym) do {		\
+	extern s32 __start_runtime_##type##_##sym[];	\
+	extern s32 __stop_runtime_##type##_##sym[];	\
+	runtime_const_fixup(__runtime_fixup_##type,	\
+		(unsigned long)(sym), 			\
+		__start_runtime_##type##_##sym,		\
+		__stop_runtime_##type##_##sym);		\
+} while (0)
+
+/*
+ * The text patching is trivial - you can only do this at init time,
+ * when the text section hasn't been marked RO, and before the text
+ * has ever been executed.
+ */
+static inline void __runtime_fixup_ptr(void *where, unsigned long val)
+{
+	*(unsigned long *)where = val;
+}
+
+static inline void __runtime_fixup_shift(void *where, unsigned long val)
+{
+	*(unsigned char *)where = val;
+}
+
+static inline void runtime_const_fixup(void (*fn)(void *, unsigned long),
+	unsigned long val, s32 *start, s32 *end)
+{
+	while (start < end) {
+		fn(*start + (void *)start, val);
+		start++;
+	}
+}
+
+#endif
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 3509afc6a672..6e73403e874f 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -357,6 +357,9 @@ SECTIONS
 	PERCPU_SECTION(INTERNODE_CACHE_BYTES)
 #endif
 
+	RUNTIME_CONST(shift, d_hash_shift)
+	RUNTIME_CONST(ptr, dentry_hashtable)
+
 	. = ALIGN(PAGE_SIZE);
 
 	/* freed after init ends here */
-- 
2.45.1.209.gc6f12300df



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 4/4] arm64: add 'runtime constant' support
  2024-07-09 15:52 [PATCH 0/4] Runtime constants series Linus Torvalds
                   ` (2 preceding siblings ...)
  2024-07-09 15:52 ` [PATCH 3/4] runtime constants: add x86 architecture support Linus Torvalds
@ 2024-07-09 15:52 ` Linus Torvalds
  3 siblings, 0 replies; 5+ messages in thread
From: Linus Torvalds @ 2024-07-09 15:52 UTC (permalink / raw)
  To: Mark Rutland; +Cc: Linux ARM, Linus Torvalds

This implements the runtime constant infrastructure for arm64, allowing
the dcache d_hash() function to be generated using as a constant for
hash table address followed by shift by a constant of the hash index.

[ Fixed up to deal with the big-endian case as per Mark Rutland ]

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 arch/arm64/include/asm/runtime-const.h | 88 ++++++++++++++++++++++++++
 arch/arm64/kernel/vmlinux.lds.S        |  3 +
 2 files changed, 91 insertions(+)
 create mode 100644 arch/arm64/include/asm/runtime-const.h

diff --git a/arch/arm64/include/asm/runtime-const.h b/arch/arm64/include/asm/runtime-const.h
new file mode 100644
index 000000000000..be5915669d23
--- /dev/null
+++ b/arch/arm64/include/asm/runtime-const.h
@@ -0,0 +1,88 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_RUNTIME_CONST_H
+#define _ASM_RUNTIME_CONST_H
+
+#include <asm/cacheflush.h>
+
+/* Sigh. You can still run arm64 in BE mode */
+#include <asm/byteorder.h>
+
+#define runtime_const_ptr(sym) ({				\
+	typeof(sym) __ret;					\
+	asm_inline("1:\t"					\
+		"movz %0, #0xcdef\n\t"				\
+		"movk %0, #0x89ab, lsl #16\n\t"			\
+		"movk %0, #0x4567, lsl #32\n\t"			\
+		"movk %0, #0x0123, lsl #48\n\t"			\
+		".pushsection runtime_ptr_" #sym ",\"a\"\n\t"	\
+		".long 1b - .\n\t"				\
+		".popsection"					\
+		:"=r" (__ret));					\
+	__ret; })
+
+#define runtime_const_shift_right_32(val, sym) ({		\
+	unsigned long __ret;					\
+	asm_inline("1:\t"					\
+		"lsr %w0,%w1,#12\n\t"				\
+		".pushsection runtime_shift_" #sym ",\"a\"\n\t"	\
+		".long 1b - .\n\t"				\
+		".popsection"					\
+		:"=r" (__ret)					\
+		:"r" (0u+(val)));				\
+	__ret; })
+
+#define runtime_const_init(type, sym) do {		\
+	extern s32 __start_runtime_##type##_##sym[];	\
+	extern s32 __stop_runtime_##type##_##sym[];	\
+	runtime_const_fixup(__runtime_fixup_##type,	\
+		(unsigned long)(sym), 			\
+		__start_runtime_##type##_##sym,		\
+		__stop_runtime_##type##_##sym);		\
+} while (0)
+
+/* 16-bit immediate for wide move (movz and movk) in bits 5..20 */
+static inline void __runtime_fixup_16(__le32 *p, unsigned int val)
+{
+	u32 insn = le32_to_cpu(*p);
+	insn &= 0xffe0001f;
+	insn |= (val & 0xffff) << 5;
+	*p = cpu_to_le32(insn);
+}
+
+static inline void __runtime_fixup_caches(void *where, unsigned int insns)
+{
+	unsigned long va = (unsigned long)where;
+	caches_clean_inval_pou(va, va + 4*insns);
+}
+
+static inline void __runtime_fixup_ptr(void *where, unsigned long val)
+{
+	__le32 *p = lm_alias(where);
+	__runtime_fixup_16(p, val);
+	__runtime_fixup_16(p+1, val >> 16);
+	__runtime_fixup_16(p+2, val >> 32);
+	__runtime_fixup_16(p+3, val >> 48);
+	__runtime_fixup_caches(where, 4);
+}
+
+/* Immediate value is 6 bits starting at bit #16 */
+static inline void __runtime_fixup_shift(void *where, unsigned long val)
+{
+	__le32 *p = lm_alias(where);
+	u32 insn = le32_to_cpu(*p);
+	insn &= 0xffc0ffff;
+	insn |= (val & 63) << 16;
+	*p = cpu_to_le32(insn);
+	__runtime_fixup_caches(where, 1);
+}
+
+static inline void runtime_const_fixup(void (*fn)(void *, unsigned long),
+	unsigned long val, s32 *start, s32 *end)
+{
+	while (start < end) {
+		fn(*start + (void *)start, val);
+		start++;
+	}
+}
+
+#endif
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 755a22d4f840..55a8e310ea12 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -264,6 +264,9 @@ SECTIONS
 		EXIT_DATA
 	}
 
+	RUNTIME_CONST(shift, d_hash_shift)
+	RUNTIME_CONST(ptr, dentry_hashtable)
+
 	PERCPU_SECTION(L1_CACHE_BYTES)
 	HYPERVISOR_PERCPU_SECTION
 
-- 
2.45.1.209.gc6f12300df



^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-07-09 16:00 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-09 15:52 [PATCH 0/4] Runtime constants series Linus Torvalds
2024-07-09 15:52 ` [PATCH 1/4] vfs: dcache: move hashlen_hash() from callers into d_hash() Linus Torvalds
2024-07-09 15:52 ` [PATCH 2/4] runtime constants: add default dummy infrastructure Linus Torvalds
2024-07-09 15:52 ` [PATCH 3/4] runtime constants: add x86 architecture support Linus Torvalds
2024-07-09 15:52 ` [PATCH 4/4] arm64: add 'runtime constant' support Linus Torvalds

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).