All of lore.kernel.org
 help / color / mirror / Atom feed
From: Charlie Jenkins <charlie@rivosinc.com>
To: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Ard Biesheuvel <ardb@kernel.org>,
	Ben Dooks <ben.dooks@codethink.co.uk>,
	Pasha Bouzarjomehri <pasha@rivosinc.com>,
	linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] riscv: Add runtime constant support
Date: Fri, 31 Jan 2025 13:23:02 -0800	[thread overview]
Message-ID: <Z50_NsMX731u5JBh@ghost> (raw)
In-Reply-To: <CAJM55Z-9-WHY+31=6j-wAiaZdaPv_aH_=P6LEi16twgYH0FL6g@mail.gmail.com>

On Fri, Jan 31, 2025 at 06:46:23AM -0800, Emil Renner Berthing wrote:
> Charlie Jenkins wrote:
> > On Wed, Jan 29, 2025 at 04:30:49AM -0500, Emil Renner Berthing wrote:
> > > Charlie Jenkins wrote:
> > > > Implement the runtime constant infrastructure for riscv. Use this
> > > > infrastructure to generate constants to be used by the d_hash()
> > > > function.
> > > >
> > > > This is the riscv variant of commit 94a2bc0f611c ("arm64: add 'runtime
> > > > constant' support") and commit e3c92e81711d ("runtime constants: add
> > > > x86 architecture support").
> > > >
> > > > Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
> > > > ---
> > > > Ard brought this to my attention in this patch [1].
> > > >
> > > > [1] https://lore.kernel.org/lkml/CAMj1kXE4DJnwFejNWQu784GvyJO=aGNrzuLjSxiowX_e7nW8QA@mail.gmail.com/
> > > > ---
> > > > Changes in v3:
> > > > - Leverage "pack" instruction for runtime_const_ptr() to reduce hot path
> > > >   by 3 instructions if Zbkb is supported. Suggested by Pasha Bouzarjomehri (pasha@rivosinc.com)
> > > > - Link to v2: https://lore.kernel.org/r/20250127-runtime_const_riscv-v2-1-95ae7cf97a39@rivosinc.com
> > > >
> > > > Changes in v2:
> > > > - Treat instructions as __le32 and do proper conversions (Ben)
> > > > - Link to v1: https://lore.kernel.org/r/20250127-runtime_const_riscv-v1-1-795b023ea20b@rivosinc.com
> > > > ---
> > > >  arch/riscv/include/asm/runtime-const.h | 194 +++++++++++++++++++++++++++++++++
> > > >  arch/riscv/kernel/vmlinux.lds.S        |   3 +
> > > >  2 files changed, 197 insertions(+)
> > > >
> > > > diff --git a/arch/riscv/include/asm/runtime-const.h b/arch/riscv/include/asm/runtime-const.h
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..0ecbe6967013900781b0b1048d4622f676b64076
> > > > --- /dev/null
> > > > +++ b/arch/riscv/include/asm/runtime-const.h
> > > > @@ -0,0 +1,194 @@
> > > > +/* SPDX-License-Identifier: GPL-2.0 */
> > > > +#ifndef _ASM_RISCV_RUNTIME_CONST_H
> > > > +#define _ASM_RISCV_RUNTIME_CONST_H
> > > > +
> > > > +#include <asm/alternative.h>
> > > > +#include <asm/cacheflush.h>
> > > > +#include <asm/text-patching.h>
> > > > +#include <linux/uaccess.h>
> > > > +
> > > > +#ifdef CONFIG_32BIT
> > > > +#define runtime_const_ptr(sym)					\
> > > > +({								\
> > > > +	typeof(sym) __ret, __tmp;				\
> > > > +	asm_inline("1:\t"					\
> > > > +		".option push"					\
> > > > +		".option norvc"					\
> > > > +		"lui	%[__ret],0x89abd\n\t"			\
> > > > +		"addi	%[__ret],-0x211\n\t"			\
> > > > +		".option pop"					\
> > > > +		".pushsection runtime_ptr_" #sym ",\"a\"\n\t"	\
> > > > +		".long 1b - .\n\t"				\
> > > > +		".popsection"					\
> > > > +		: [__ret] "=r" (__ret));			\
> > > > +	__ret;							\
> > > > +})
> > > > +#else
> > > > +/*
> > > > + * Loading 64-bit constants into a register from immediates is a non-trivial
> > > > + * task on riscv64. To get it somewhat performant, load 32 bits into two
> > > > + * different registers and then combine the results.
> > > > + *
> > > > + * If the processor supports the Zbkb extension, we can combine the final
> > > > + * "slli,slli,srli,add" into the single "pack" instruction. If the processor
> > > > + * doesn't support Zbkb but does support the Zbb extension, we can
> > > > + * combine the final "slli,srli,add" into one instruction "add.uw".
> > > > + */
> > > > +#define runtime_const_ptr(sym)						\
> > > > +({									\
> > > > +	typeof(sym) __ret, __tmp;					\
> > > > +	asm_inline("1:\t"						\
> > > > +		".option push\n\t"					\
> > > > +		".option norvc\n\t"					\
> > > > +		"lui	%[__ret],0x89abd\n\t"				\
> > > > +		"lui	%[__tmp],0x1234\n\t"				\
> > > > +		"addiw	%[__ret],%[__ret],-0x211\n\t"			\
> > > > +		"addiw	%[__tmp],%[__tmp],0x567\n\t"			\
> > > > +		ALTERNATIVE_2(						\
> > > > +			"slli	%[__tmp],%[__tmp],32\n\t"		\
> > > > +			"slli	%[__ret],%[__ret],32\n\t"		\
> > > > +			"srli	%[__ret],%[__ret],32\n\t"		\
> > > > +			"add	%[__ret],%[__ret],%[__tmp]\n\t",	\
> > > > +			".option push\n\t"				\
> > > > +			".option arch,+zba\n\t"				\
> > > > +			"slli	%[__tmp],%[__tmp],32\n\t"		\
> > > > +			"add.uw %[__ret],%[__ret],%[__tmp]\n\t"		\
> > > > +			"nop\n\t"					\
> > > > +			"nop\n\t"					\
> > > > +			".option pop\n\t",				\
> > > > +			0, RISCV_ISA_EXT_ZBA, 1,			\
> > > > +			".option push\n\t"				\
> > > > +			".option arch,+zbkb\n\t"			\
> > > > +			"pack	%[__ret],%[__ret],%[__tmp]\n\t"		\
> > > > +			"nop\n\t"					\
> > > > +			"nop\n\t"					\
> > > > +			"nop\n\t"					\
> > > > +			".option pop\n\t",				\
> > > > +			0, RISCV_ISA_EXT_ZBKB, 1			\
> > > > +		)							\
> > > > +		".option pop\n\t"					\
> > > > +		".pushsection runtime_ptr_" #sym ",\"a\"\n\t"		\
> > > > +		".long 1b - .\n\t"					\
> > > > +		".popsection"						\
> > > > +		: [__ret] "=r" (__ret), [__tmp] "=r" (__tmp));		\
> > > > +	__ret;								\
> > > > +})
> > > > +#endif
> > > > +
> > > > +#ifdef CONFIG_32BIT
> > > > +#define SRLI "srli "
> > > > +#else
> > > > +#define SRLI "srliw "
> > > > +#endif
> > > > +
> > > > +#define runtime_const_shift_right_32(val, sym)			\
> > > > +({								\
> > > > +	u32 __ret;						\
> > > > +	asm_inline("1:\t"					\
> > > > +		".option push\n\t"				\
> > > > +		".option norvc\n\t"				\
> > > > +		SRLI "%[__ret],%[__val],12\n\t"			\
> > > > +		".option pop\n\t"				\
> > > > +		".pushsection runtime_shift_" #sym ",\"a\"\n\t"	\
> > > > +		".long 1b - .\n\t"				\
> > > > +		".popsection"					\
> > > > +		: [__ret] "=r" (__ret)				\
> > > > +		: [__val] "r" (val));				\
> > > > +	__ret;							\
> > > > +})
> > > > +
> > > > +#define runtime_const_init(type, sym) do {			\
> > > > +	extern s32 __start_runtime_##type##_##sym[];		\
> > > > +	extern s32 __stop_runtime_##type##_##sym[];		\
> > > > +								\
> > > > +	runtime_const_fixup(__runtime_fixup_##type,		\
> > > > +			    (unsigned long)(sym),		\
> > > > +			    __start_runtime_##type##_##sym,	\
> > > > +			    __stop_runtime_##type##_##sym);	\
> > > > +} while (0)
> > > > +
> > > > +static inline void __runtime_fixup_caches(void *where, unsigned int insns)
> > > > +{
> > > > +	/* On riscv there are currently only cache-wide flushes so va is ignored. */
> > > > +	__always_unused uintptr_t va = (uintptr_t)where;
> > > > +
> > > > +	flush_icache_range(va, va + 4*insns);
> > > > +}
> > > > +
> > > > +/*
> > > > + * The 32-bit immediate is stored in a lui+addi pairing.
> > > > + * lui holds the upper 20 bits of the immediate in the first 20 bits of the instruction.
> > > > + * addi holds the lower 12 bits of the immediate in the first 12 bits of the instruction.
> > > > + */
> > > > +static inline void __runtime_fixup_32(u32 *lui, u32 *addi, unsigned int val)
> > > > +{
> > > > +	unsigned int lower_immediate, upper_immediate;
> > > > +	u32 lui_insn = le32_to_cpu(*lui);
> > > > +	u32 addi_insn = le32_to_cpu(*addi);
> > >
> > > Because of the compressed extensions RISC-V instructions are only aligned on
> > > 16bit boundaries, so is there another reason you know that these two
> > > instructions are 32bit aligned? Otherwise you're adding unaligned accesses
> > > here.
> >
> > Great point, thank you. I will add a ".align 4" to the beginning of
> > these instructions to force the alignment.
> 
> Unless there is a specific reason to do the alignment in this case, I'd much
> rather we just do two 16bit reads like riscv/kernel/module.c already does.

I have no strong preference here. That does seem like a better solution
though.

- Charlie

> 
> /Emil

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

WARNING: multiple messages have this Message-ID (diff)
From: Charlie Jenkins <charlie@rivosinc.com>
To: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Ard Biesheuvel <ardb@kernel.org>,
	Ben Dooks <ben.dooks@codethink.co.uk>,
	Pasha Bouzarjomehri <pasha@rivosinc.com>,
	linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] riscv: Add runtime constant support
Date: Fri, 31 Jan 2025 13:23:02 -0800	[thread overview]
Message-ID: <Z50_NsMX731u5JBh@ghost> (raw)
In-Reply-To: <CAJM55Z-9-WHY+31=6j-wAiaZdaPv_aH_=P6LEi16twgYH0FL6g@mail.gmail.com>

On Fri, Jan 31, 2025 at 06:46:23AM -0800, Emil Renner Berthing wrote:
> Charlie Jenkins wrote:
> > On Wed, Jan 29, 2025 at 04:30:49AM -0500, Emil Renner Berthing wrote:
> > > Charlie Jenkins wrote:
> > > > Implement the runtime constant infrastructure for riscv. Use this
> > > > infrastructure to generate constants to be used by the d_hash()
> > > > function.
> > > >
> > > > This is the riscv variant of commit 94a2bc0f611c ("arm64: add 'runtime
> > > > constant' support") and commit e3c92e81711d ("runtime constants: add
> > > > x86 architecture support").
> > > >
> > > > Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
> > > > ---
> > > > Ard brought this to my attention in this patch [1].
> > > >
> > > > [1] https://lore.kernel.org/lkml/CAMj1kXE4DJnwFejNWQu784GvyJO=aGNrzuLjSxiowX_e7nW8QA@mail.gmail.com/
> > > > ---
> > > > Changes in v3:
> > > > - Leverage "pack" instruction for runtime_const_ptr() to reduce hot path
> > > >   by 3 instructions if Zbkb is supported. Suggested by Pasha Bouzarjomehri (pasha@rivosinc.com)
> > > > - Link to v2: https://lore.kernel.org/r/20250127-runtime_const_riscv-v2-1-95ae7cf97a39@rivosinc.com
> > > >
> > > > Changes in v2:
> > > > - Treat instructions as __le32 and do proper conversions (Ben)
> > > > - Link to v1: https://lore.kernel.org/r/20250127-runtime_const_riscv-v1-1-795b023ea20b@rivosinc.com
> > > > ---
> > > >  arch/riscv/include/asm/runtime-const.h | 194 +++++++++++++++++++++++++++++++++
> > > >  arch/riscv/kernel/vmlinux.lds.S        |   3 +
> > > >  2 files changed, 197 insertions(+)
> > > >
> > > > diff --git a/arch/riscv/include/asm/runtime-const.h b/arch/riscv/include/asm/runtime-const.h
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..0ecbe6967013900781b0b1048d4622f676b64076
> > > > --- /dev/null
> > > > +++ b/arch/riscv/include/asm/runtime-const.h
> > > > @@ -0,0 +1,194 @@
> > > > +/* SPDX-License-Identifier: GPL-2.0 */
> > > > +#ifndef _ASM_RISCV_RUNTIME_CONST_H
> > > > +#define _ASM_RISCV_RUNTIME_CONST_H
> > > > +
> > > > +#include <asm/alternative.h>
> > > > +#include <asm/cacheflush.h>
> > > > +#include <asm/text-patching.h>
> > > > +#include <linux/uaccess.h>
> > > > +
> > > > +#ifdef CONFIG_32BIT
> > > > +#define runtime_const_ptr(sym)					\
> > > > +({								\
> > > > +	typeof(sym) __ret, __tmp;				\
> > > > +	asm_inline("1:\t"					\
> > > > +		".option push"					\
> > > > +		".option norvc"					\
> > > > +		"lui	%[__ret],0x89abd\n\t"			\
> > > > +		"addi	%[__ret],-0x211\n\t"			\
> > > > +		".option pop"					\
> > > > +		".pushsection runtime_ptr_" #sym ",\"a\"\n\t"	\
> > > > +		".long 1b - .\n\t"				\
> > > > +		".popsection"					\
> > > > +		: [__ret] "=r" (__ret));			\
> > > > +	__ret;							\
> > > > +})
> > > > +#else
> > > > +/*
> > > > + * Loading 64-bit constants into a register from immediates is a non-trivial
> > > > + * task on riscv64. To get it somewhat performant, load 32 bits into two
> > > > + * different registers and then combine the results.
> > > > + *
> > > > + * If the processor supports the Zbkb extension, we can combine the final
> > > > + * "slli,slli,srli,add" into the single "pack" instruction. If the processor
> > > > + * doesn't support Zbkb but does support the Zbb extension, we can
> > > > + * combine the final "slli,srli,add" into one instruction "add.uw".
> > > > + */
> > > > +#define runtime_const_ptr(sym)						\
> > > > +({									\
> > > > +	typeof(sym) __ret, __tmp;					\
> > > > +	asm_inline("1:\t"						\
> > > > +		".option push\n\t"					\
> > > > +		".option norvc\n\t"					\
> > > > +		"lui	%[__ret],0x89abd\n\t"				\
> > > > +		"lui	%[__tmp],0x1234\n\t"				\
> > > > +		"addiw	%[__ret],%[__ret],-0x211\n\t"			\
> > > > +		"addiw	%[__tmp],%[__tmp],0x567\n\t"			\
> > > > +		ALTERNATIVE_2(						\
> > > > +			"slli	%[__tmp],%[__tmp],32\n\t"		\
> > > > +			"slli	%[__ret],%[__ret],32\n\t"		\
> > > > +			"srli	%[__ret],%[__ret],32\n\t"		\
> > > > +			"add	%[__ret],%[__ret],%[__tmp]\n\t",	\
> > > > +			".option push\n\t"				\
> > > > +			".option arch,+zba\n\t"				\
> > > > +			"slli	%[__tmp],%[__tmp],32\n\t"		\
> > > > +			"add.uw %[__ret],%[__ret],%[__tmp]\n\t"		\
> > > > +			"nop\n\t"					\
> > > > +			"nop\n\t"					\
> > > > +			".option pop\n\t",				\
> > > > +			0, RISCV_ISA_EXT_ZBA, 1,			\
> > > > +			".option push\n\t"				\
> > > > +			".option arch,+zbkb\n\t"			\
> > > > +			"pack	%[__ret],%[__ret],%[__tmp]\n\t"		\
> > > > +			"nop\n\t"					\
> > > > +			"nop\n\t"					\
> > > > +			"nop\n\t"					\
> > > > +			".option pop\n\t",				\
> > > > +			0, RISCV_ISA_EXT_ZBKB, 1			\
> > > > +		)							\
> > > > +		".option pop\n\t"					\
> > > > +		".pushsection runtime_ptr_" #sym ",\"a\"\n\t"		\
> > > > +		".long 1b - .\n\t"					\
> > > > +		".popsection"						\
> > > > +		: [__ret] "=r" (__ret), [__tmp] "=r" (__tmp));		\
> > > > +	__ret;								\
> > > > +})
> > > > +#endif
> > > > +
> > > > +#ifdef CONFIG_32BIT
> > > > +#define SRLI "srli "
> > > > +#else
> > > > +#define SRLI "srliw "
> > > > +#endif
> > > > +
> > > > +#define runtime_const_shift_right_32(val, sym)			\
> > > > +({								\
> > > > +	u32 __ret;						\
> > > > +	asm_inline("1:\t"					\
> > > > +		".option push\n\t"				\
> > > > +		".option norvc\n\t"				\
> > > > +		SRLI "%[__ret],%[__val],12\n\t"			\
> > > > +		".option pop\n\t"				\
> > > > +		".pushsection runtime_shift_" #sym ",\"a\"\n\t"	\
> > > > +		".long 1b - .\n\t"				\
> > > > +		".popsection"					\
> > > > +		: [__ret] "=r" (__ret)				\
> > > > +		: [__val] "r" (val));				\
> > > > +	__ret;							\
> > > > +})
> > > > +
> > > > +#define runtime_const_init(type, sym) do {			\
> > > > +	extern s32 __start_runtime_##type##_##sym[];		\
> > > > +	extern s32 __stop_runtime_##type##_##sym[];		\
> > > > +								\
> > > > +	runtime_const_fixup(__runtime_fixup_##type,		\
> > > > +			    (unsigned long)(sym),		\
> > > > +			    __start_runtime_##type##_##sym,	\
> > > > +			    __stop_runtime_##type##_##sym);	\
> > > > +} while (0)
> > > > +
> > > > +static inline void __runtime_fixup_caches(void *where, unsigned int insns)
> > > > +{
> > > > +	/* On riscv there are currently only cache-wide flushes so va is ignored. */
> > > > +	__always_unused uintptr_t va = (uintptr_t)where;
> > > > +
> > > > +	flush_icache_range(va, va + 4*insns);
> > > > +}
> > > > +
> > > > +/*
> > > > + * The 32-bit immediate is stored in a lui+addi pairing.
> > > > + * lui holds the upper 20 bits of the immediate in the first 20 bits of the instruction.
> > > > + * addi holds the lower 12 bits of the immediate in the first 12 bits of the instruction.
> > > > + */
> > > > +static inline void __runtime_fixup_32(u32 *lui, u32 *addi, unsigned int val)
> > > > +{
> > > > +	unsigned int lower_immediate, upper_immediate;
> > > > +	u32 lui_insn = le32_to_cpu(*lui);
> > > > +	u32 addi_insn = le32_to_cpu(*addi);
> > >
> > > Because of the compressed extensions RISC-V instructions are only aligned on
> > > 16bit boundaries, so is there another reason you know that these two
> > > instructions are 32bit aligned? Otherwise you're adding unaligned accesses
> > > here.
> >
> > Great point, thank you. I will add a ".align 4" to the beginning of
> > these instructions to force the alignment.
> 
> Unless there is a specific reason to do the alignment in this case, I'd much
> rather we just do two 16bit reads like riscv/kernel/module.c already does.

I have no strong preference here. That does seem like a better solution
though.

- Charlie

> 
> /Emil

  reply	other threads:[~2025-01-31 21:23 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-28 20:06 [PATCH v3] riscv: Add runtime constant support Charlie Jenkins
2025-01-28 20:06 ` Charlie Jenkins
2025-01-29  9:30 ` Emil Renner Berthing
2025-01-29  9:30   ` Emil Renner Berthing
2025-01-30 23:13   ` Charlie Jenkins
2025-01-30 23:13     ` Charlie Jenkins
2025-01-31 14:46     ` Emil Renner Berthing
2025-01-31 14:46       ` Emil Renner Berthing
2025-01-31 21:23       ` Charlie Jenkins [this message]
2025-01-31 21:23         ` Charlie Jenkins
2025-01-31  0:48 ` kernel test robot
2025-01-31  0:48   ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z50_NsMX731u5JBh@ghost \
    --to=charlie@rivosinc.com \
    --cc=ardb@kernel.org \
    --cc=ben.dooks@codethink.co.uk \
    --cc=emil.renner.berthing@canonical.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=palmer@dabbelt.com \
    --cc=pasha@rivosinc.com \
    --cc=paul.walmsley@sifive.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.