From: Charlie Jenkins <charlie@rivosinc.com>
To: Andy Chiu <andy.chiu@sifive.com>
Cc: linux-riscv@lists.infradead.org, palmer@dabbelt.com,
	paul.walmsley@sifive.com, greentime.hu@sifive.com,
	guoren@linux.alibaba.com, bjorn@kernel.org, ardb@kernel.org,
	arnd@arndb.de, peterz@infradead.org, tglx@linutronix.de,
	ebiggers@kernel.org, Albert Ou <aou@eecs.berkeley.edu>,
	Kees Cook <keescook@chromium.org>,
	Han-Kuan Chen <hankuan.chen@sifive.com>,
	Conor Dooley <conor.dooley@microchip.com>,
	Andrew Jones <ajones@ventanamicro.com>,
	Heiko Stuebner <heiko@sntech.de>
Subject: Re: [v8, 06/10] riscv: lib: add vectorized mem* routines
Date: Tue, 26 Dec 2023 17:42:25 -0800
Message-ID: <ZYuBAVwjOZ0A8j3J@ghost>
In-Reply-To: <20231223042914.18599-7-andy.chiu@sifive.com>

On Sat, Dec 23, 2023 at 04:29:10AM +0000, Andy Chiu wrote:
> Provide vectorized memcpy/memset/memmove to accelerate common memory
> operations. Also, group them under the V_OPT_TEMPLATE3 macro because their
> setup/tear-down and fallback logic are the same.
> 
> The optimal size at which the kernel should prefer Vector over scalar,
> riscv_v_mem*_threshold, is only a heuristic for now. We can add DT
> parsing if people feel the need to customize it.
> 
> The original implementation of Vector operations comes from
> https://github.com/sifive/sifive-libc, which we agree to contribute to
> the Linux kernel.
> 
> Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
> ---
> Changelog v7:
>  - add __NO_FORTIFY to prevent conflicting function declaration with
>    macro for mem* functions.
> Changelog v6:
>  - provide kconfig to set threshold for vectorized functions (Charlie)
>  - rename *thres to *threshold (Charlie)
> Changelog v4:
>  - new patch since v4
> ---
>  arch/riscv/Kconfig               | 24 ++++++++++++++++
>  arch/riscv/lib/Makefile          |  3 ++
>  arch/riscv/lib/memcpy_vector.S   | 29 +++++++++++++++++++
>  arch/riscv/lib/memmove_vector.S  | 49 ++++++++++++++++++++++++++++++++
>  arch/riscv/lib/memset_vector.S   | 33 +++++++++++++++++++++
>  arch/riscv/lib/riscv_v_helpers.c | 26 +++++++++++++++++
>  6 files changed, 164 insertions(+)
>  create mode 100644 arch/riscv/lib/memcpy_vector.S
>  create mode 100644 arch/riscv/lib/memmove_vector.S
>  create mode 100644 arch/riscv/lib/memset_vector.S
> 
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 3c5ba05e8a2d..cba53dcc2ae0 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -533,6 +533,30 @@ config RISCV_ISA_V_UCOPY_THRESHOLD
>  	  Prefer using vectorized copy_to_user()/copy_from_user() when the
>  	  workload size exceeds this value.
>  
> +config RISCV_ISA_V_MEMSET_THRESHOLD
> +	int "Threshold size for vectorized memset()"
> +	depends on RISCV_ISA_V
> +	default 1280
> +	help
> +	  Prefer using vectorized memset() when the workload size exceeds this
> +	  value.
> +
> +config RISCV_ISA_V_MEMCPY_THRESHOLD
> +	int "Threshold size for vectorized memcpy()"
> +	depends on RISCV_ISA_V
> +	default 768
> +	help
> +	  Prefer using vectorized memcpy() when the workload size exceeds this
> +	  value.
> +
> +config RISCV_ISA_V_MEMMOVE_THRESHOLD
> +	int "Threshold size for vectorized memmove()"
> +	depends on RISCV_ISA_V
> +	default 512
> +	help
> +	  Prefer using vectorized memmove() when the workload size exceeds this
> +	  value.
> +
>  config TOOLCHAIN_HAS_ZBB
>  	bool
>  	default y
> diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
> index c8a6787d5827..d389dbf285fe 100644
> --- a/arch/riscv/lib/Makefile
> +++ b/arch/riscv/lib/Makefile
> @@ -16,3 +16,6 @@ lib-$(CONFIG_RISCV_ISA_ZICBOZ)	+= clear_page.o
>  obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
>  lib-$(CONFIG_RISCV_ISA_V)	+= xor.o
>  lib-$(CONFIG_RISCV_ISA_V)	+= riscv_v_helpers.o
> +lib-$(CONFIG_RISCV_ISA_V)	+= memset_vector.o
> +lib-$(CONFIG_RISCV_ISA_V)	+= memcpy_vector.o
> +lib-$(CONFIG_RISCV_ISA_V)	+= memmove_vector.o
> diff --git a/arch/riscv/lib/memcpy_vector.S b/arch/riscv/lib/memcpy_vector.S
> new file mode 100644
> index 000000000000..4176b6e0a53c
> --- /dev/null
> +++ b/arch/riscv/lib/memcpy_vector.S
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include <linux/linkage.h>
> +#include <asm/asm.h>
> +
> +#define pDst a0
> +#define pSrc a1
> +#define iNum a2
> +
> +#define iVL a3
> +#define pDstPtr a4
> +
> +#define ELEM_LMUL_SETTING m8
> +#define vData v0
> +
> +
> +/* void *memcpy(void *, const void *, size_t) */
> +SYM_FUNC_START(__asm_memcpy_vector)
> +	mv pDstPtr, pDst
> +loop:
> +	vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
> +	vle8.v vData, (pSrc)
> +	sub iNum, iNum, iVL
> +	add pSrc, pSrc, iVL
> +	vse8.v vData, (pDstPtr)
> +	add pDstPtr, pDstPtr, iVL
> +	bnez iNum, loop
> +	ret
> +SYM_FUNC_END(__asm_memcpy_vector)
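
For readers following the loop above, here is a minimal userspace C sketch of
the same strip-mining pattern. It is an illustration under stated assumptions,
not the kernel code: MODEL_VLMAX, model_vsetvli() and model_memcpy_vector()
are hypothetical names, VLEN is assumed to be 128 bits (so VLMAX is 128 bytes
at e8/m8), and vsetvli is modelled as min(AVL, VLMAX), which is one choice the
V specification permits rather than the only one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical VLMAX for e8/m8, assuming VLEN = 128 bits: 128 / 8 * 8. */
#define MODEL_VLMAX 128

/* Simplified model of vsetvli: grant min(avl, vlmax) elements. */
static size_t model_vsetvli(size_t avl, size_t vlmax)
{
	assert(vlmax > 0);	/* otherwise the copy loop could never progress */
	return avl < vlmax ? avl : vlmax;
}

/* Userspace model of the loop in __asm_memcpy_vector. */
static void *model_memcpy_vector(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;			/* mv pDstPtr, pDst */
	const unsigned char *s = src;
	unsigned char vdata[MODEL_VLMAX];	/* stands in for the v0 group */

	do {
		size_t vl = model_vsetvli(n, MODEL_VLMAX);

		memcpy(vdata, s, vl);		/* vle8.v vData, (pSrc) */
		n -= vl;			/* sub iNum, iNum, iVL */
		s += vl;			/* add pSrc, pSrc, iVL */
		memcpy(d, vdata, vl);		/* vse8.v vData, (pDstPtr) */
		d += vl;			/* add pDstPtr, pDstPtr, iVL */
	} while (n != 0);			/* bnez iNum, loop */

	return dst;				/* a0 is left untouched */
}
```

In this model a 300-byte copy takes three iterations (128, 128 and 44 bytes);
the tail needs no scalar cleanup because the last vsetvli simply grants fewer
elements.
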
> diff --git a/arch/riscv/lib/memmove_vector.S b/arch/riscv/lib/memmove_vector.S
> new file mode 100644
> index 000000000000..4cea9d244dc9
> --- /dev/null
> +++ b/arch/riscv/lib/memmove_vector.S
> @@ -0,0 +1,49 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +#include <linux/linkage.h>
> +#include <asm/asm.h>
> +
> +#define pDst a0
> +#define pSrc a1
> +#define iNum a2
> +
> +#define iVL a3
> +#define pDstPtr a4
> +#define pSrcBackwardPtr a5
> +#define pDstBackwardPtr a6
> +
> +#define ELEM_LMUL_SETTING m8
> +#define vData v0
> +
> +SYM_FUNC_START(__asm_memmove_vector)
> +
> +    mv pDstPtr, pDst
> +
> +    bgeu pSrc, pDst, forward_copy_loop
> +    add pSrcBackwardPtr, pSrc, iNum
> +    add pDstBackwardPtr, pDst, iNum
> +    bltu pDst, pSrcBackwardPtr, backward_copy_loop
> +
> +forward_copy_loop:
> +    vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
> +
> +    vle8.v vData, (pSrc)
> +    sub iNum, iNum, iVL
> +    add pSrc, pSrc, iVL
> +    vse8.v vData, (pDstPtr)
> +    add pDstPtr, pDstPtr, iVL
> +
> +    bnez iNum, forward_copy_loop
> +    ret
> +
> +backward_copy_loop:
> +    vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
> +
> +    sub pSrcBackwardPtr, pSrcBackwardPtr, iVL
> +    vle8.v vData, (pSrcBackwardPtr)
> +    sub iNum, iNum, iVL
> +    sub pDstBackwardPtr, pDstBackwardPtr, iVL
> +    vse8.v vData, (pDstBackwardPtr)
> +    bnez iNum, backward_copy_loop
> +    ret
> +
> +SYM_FUNC_END(__asm_memmove_vector)
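
The direction choice above can be modelled in userspace C roughly as follows.
This is a sketch under assumptions: MOVE_MODEL_VLMAX and
model_memmove_vector() are hypothetical names, VLMAX is assumed to be 128
bytes, and each vsetvli is modelled as min(n, VLMAX). The point it tries to
show is that a chunk is loaded completely before it is stored, so copying from
the tail is safe whenever dst lies inside [src, src + n).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical VLMAX in bytes for e8/m8, assuming VLEN = 128 bits. */
#define MOVE_MODEL_VLMAX 128

/* Userspace model of the branch structure in __asm_memmove_vector. */
static void *model_memmove_vector(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	unsigned char vdata[MOVE_MODEL_VLMAX];	/* stands in for the v0 group */
	uintptr_t da = (uintptr_t)dst, sa = (uintptr_t)src;

	/* bgeu pSrc, pDst, forward / bltu pDst, pSrc + iNum, backward */
	if (sa < da && da < sa + n) {
		assert(n > 0);		/* implied by sa < da < sa + n */
		s += n;			/* pSrcBackwardPtr = pSrc + iNum */
		d += n;			/* pDstBackwardPtr = pDst + iNum */
		do {
			size_t vl = n < MOVE_MODEL_VLMAX ? n : MOVE_MODEL_VLMAX;

			s -= vl;
			memcpy(vdata, s, vl);	/* load the whole chunk first */
			n -= vl;
			d -= vl;
			memcpy(d, vdata, vl);	/* then store it */
		} while (n != 0);
		return dst;
	}

	/* No harmful overlap: copy from the head, as in the memcpy loop. */
	do {
		size_t vl = n < MOVE_MODEL_VLMAX ? n : MOVE_MODEL_VLMAX;

		memcpy(vdata, s, vl);
		n -= vl;
		s += vl;
		memcpy(d, vdata, vl);
		d += vl;
	} while (n != 0);

	return dst;
}
```

Calling model_memmove_vector(buf + 10, buf, 300) takes the backward path,
while model_memmove_vector(buf + 3, buf + 50, 300) takes the forward one; both
are expected to match the C library's memmove() on the same buffers.
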
> diff --git a/arch/riscv/lib/memset_vector.S b/arch/riscv/lib/memset_vector.S
> new file mode 100644
> index 000000000000..4611feed72ac
> --- /dev/null
> +++ b/arch/riscv/lib/memset_vector.S
> @@ -0,0 +1,33 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +#include <linux/linkage.h>
> +#include <asm/asm.h>
> +
> +#define pDst a0
> +#define iValue a1
> +#define iNum a2
> +
> +#define iVL a3
> +#define iTemp a4
> +#define pDstPtr a5
> +
> +#define ELEM_LMUL_SETTING m8
> +#define vData v0
> +
> +/* void *memset(void *, int, size_t) */
> +SYM_FUNC_START(__asm_memset_vector)
> +
> +    mv pDstPtr, pDst
> +
> +    vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
> +    vmv.v.x vData, iValue
> +
> +loop:
> +    vse8.v vData, (pDstPtr)
> +    sub iNum, iNum, iVL
> +    add pDstPtr, pDstPtr, iVL
> +    vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
> +    bnez iNum, loop
> +
> +    ret
> +
> +SYM_FUNC_END(__asm_memset_vector)
> diff --git a/arch/riscv/lib/riscv_v_helpers.c b/arch/riscv/lib/riscv_v_helpers.c
> index 6cac8f4e69e9..c62f333ba557 100644
> --- a/arch/riscv/lib/riscv_v_helpers.c
> +++ b/arch/riscv/lib/riscv_v_helpers.c
> @@ -3,9 +3,13 @@
>   * Copyright (C) 2023 SiFive
>   * Author: Andy Chiu <andy.chiu@sifive.com>
>   */
> +#ifndef __NO_FORTIFY
> +# define __NO_FORTIFY
> +#endif
>  #include <linux/linkage.h>
>  #include <asm/asm.h>
>  
> +#include <asm/string.h>
>  #include <asm/vector.h>
>  #include <asm/simd.h>
>  
> @@ -42,3 +46,25 @@ asmlinkage int enter_vector_usercopy(void *dst, void *src, size_t n)
>  	return fallback_scalar_usercopy(dst, src, n);
>  }
>  #endif
> +
> +#define V_OPT_TEMPLATE3(prefix, type_r, type_0, type_1)				\
> +extern type_r __asm_##prefix##_vector(type_0, type_1, size_t n);		\
> +type_r prefix(type_0 a0, type_1 a1, size_t n)					\
> +{										\
> +	type_r ret;								\
> +	if (has_vector() && may_use_simd() &&					\
> +	    n > riscv_v_##prefix##_threshold) {					\
> +		kernel_vector_begin();						\
> +		ret = __asm_##prefix##_vector(a0, a1, n);			\
> +		kernel_vector_end();						\
> +		return ret;							\
> +	}									\
> +	return __##prefix(a0, a1, n);						\
> +}
> +
> +static size_t riscv_v_memset_threshold = CONFIG_RISCV_ISA_V_MEMSET_THRESHOLD;
> +V_OPT_TEMPLATE3(memset, void *, void*, int)
> +static size_t riscv_v_memcpy_threshold = CONFIG_RISCV_ISA_V_MEMCPY_THRESHOLD;
> +V_OPT_TEMPLATE3(memcpy, void *, void*, const void *)
> +static size_t riscv_v_memmove_threshold = CONFIG_RISCV_ISA_V_MEMMOVE_THRESHOLD;
> +V_OPT_TEMPLATE3(memmove, void *, void*, const void *)
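
To make the threshold behaviour concrete, here is a small userspace C model of
what V_OPT_TEMPLATE3 expands to for memcpy. Everything prefixed with model_ is
hypothetical: the two flags stand in for has_vector() and may_use_simd(), the
counter exists only so the chosen path can be observed, and the C library's
memcpy() stands in for both __asm_memcpy_vector and the scalar __memcpy. The
768 value is the Kconfig default from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for has_vector() and may_use_simd(). */
static bool model_has_vector = true;
static bool model_may_use_simd = true;

/* Default of CONFIG_RISCV_ISA_V_MEMCPY_THRESHOLD in the patch. */
static size_t model_memcpy_threshold = 768;

/* Counts how often the "vector" path ran; for illustration only. */
static int model_vector_calls;

/* Stands in for __asm_memcpy_vector. */
static void *model_vector_memcpy(void *dst, const void *src, size_t n)
{
	assert(n > model_memcpy_threshold);	/* only reached above the threshold */
	model_vector_calls++;
	return memcpy(dst, src, n);
}

/* Model of the wrapper generated by V_OPT_TEMPLATE3(memcpy, ...). */
static void *model_memcpy(void *dst, const void *src, size_t n)
{
	if (model_has_vector && model_may_use_simd &&
	    n > model_memcpy_threshold) {
		void *ret;

		/* the kernel calls kernel_vector_begin() here */
		ret = model_vector_memcpy(dst, src, n);
		/* ... and kernel_vector_end() here */
		return ret;
	}
	return memcpy(dst, src, n);		/* stands in for __memcpy */
}
```

Because the comparison is strict, a copy of exactly 768 bytes stays on the
scalar path in this model and 769 bytes is the first size to take the vector
path. Clearing model_may_use_simd forces the scalar fallback regardless of
size.
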
> -- 
> 2.17.1
> 

Thank you for adding the kconfigs for the thresholds.

Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>

