From: "Heiko Stübner" <heiko@sntech.de>
To: Andrew Jones <ajones@ventanamicro.com>
Cc: linux-riscv@lists.infradead.org, palmer@dabbelt.com,
	christoph.muellner@vrull.eu, conor@kernel.org,
	philipp.tomsich@vrull.eu, jszhang@kernel.org
Subject: Re: [PATCH v4 4/5] RISC-V: add infrastructure to allow different str* implementations
Date: Thu, 12 Jan 2023 17:05:19 +0100
Message-ID: <2094019.QZUTf85G27@diego>
In-Reply-To: <20230110121320.zr4tk4nl2w57klqg@orel>

Hi Andrew,

Thanks a lot for taking the time to provide these extensive and really
helpful comments.

On Tuesday, 10 January 2023, 13:13:20 CET, Andrew Jones wrote:
> On Mon, Jan 09, 2023 at 07:17:54PM +0100, Heiko Stuebner wrote:
> > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > 
> > Depending on supported extensions on specific RISC-V cores,
> > optimized str* functions might make sense.
> > 
> > This adds basic infrastructure to allow patching the function calls
> > via alternatives later on.
> > 
> > The Linux kernel provides standard implementations for string functions
> > but when architectures want to extend them, they need to provide their
> > own.
> > 
> > The added generic string functions are done in assembler (taken from
> > disassembling the main-kernel functions for now) to allow us to control
> > the used registers and extend them with optimized variants.
> > 
> > Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > ---
> >  arch/riscv/include/asm/string.h | 10 +++++++++
> >  arch/riscv/kernel/riscv_ksyms.c |  3 +++
> >  arch/riscv/lib/Makefile         |  3 +++
> >  arch/riscv/lib/strcmp.S         | 37 ++++++++++++++++++++++++++++++
> >  arch/riscv/lib/strlen.S         | 28 +++++++++++++++++++++++
> >  arch/riscv/lib/strncmp.S        | 40 +++++++++++++++++++++++++++++++++
> >  arch/riscv/purgatory/Makefile   | 13 +++++++++++
> >  7 files changed, 134 insertions(+)
> >  create mode 100644 arch/riscv/lib/strcmp.S
> >  create mode 100644 arch/riscv/lib/strlen.S
> >  create mode 100644 arch/riscv/lib/strncmp.S
> > 
> > diff --git a/arch/riscv/include/asm/string.h b/arch/riscv/include/asm/string.h
> > index 909049366555..a96b1fea24fe 100644
> > --- a/arch/riscv/include/asm/string.h
> > +++ b/arch/riscv/include/asm/string.h
> > @@ -18,6 +18,16 @@ extern asmlinkage void *__memcpy(void *, const void *, size_t);
> >  #define __HAVE_ARCH_MEMMOVE
> >  extern asmlinkage void *memmove(void *, const void *, size_t);
> >  extern asmlinkage void *__memmove(void *, const void *, size_t);
> > +
> > +#define __HAVE_ARCH_STRCMP
> > +extern asmlinkage int strcmp(const char *cs, const char *ct);
> > +
> > +#define __HAVE_ARCH_STRLEN
> > +extern asmlinkage __kernel_size_t strlen(const char *);
> > +
> > +#define __HAVE_ARCH_STRNCMP
> > +extern asmlinkage int strncmp(const char *cs, const char *ct, size_t count);
> > +
> >  /* For those files which don't want to check by kasan. */
> >  #if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__)
> >  #define memcpy(dst, src, len) __memcpy(dst, src, len)
> > diff --git a/arch/riscv/kernel/riscv_ksyms.c b/arch/riscv/kernel/riscv_ksyms.c
> > index 5ab1c7e1a6ed..a72879b4249a 100644
> > --- a/arch/riscv/kernel/riscv_ksyms.c
> > +++ b/arch/riscv/kernel/riscv_ksyms.c
> > @@ -12,6 +12,9 @@
> >  EXPORT_SYMBOL(memset);
> >  EXPORT_SYMBOL(memcpy);
> >  EXPORT_SYMBOL(memmove);
> > +EXPORT_SYMBOL(strcmp);
> > +EXPORT_SYMBOL(strlen);
> > +EXPORT_SYMBOL(strncmp);
> >  EXPORT_SYMBOL(__memset);
> >  EXPORT_SYMBOL(__memcpy);
> >  EXPORT_SYMBOL(__memmove);
> > diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
> > index 25d5c9664e57..6c74b0bedd60 100644
> > --- a/arch/riscv/lib/Makefile
> > +++ b/arch/riscv/lib/Makefile
> > @@ -3,6 +3,9 @@ lib-y			+= delay.o
> >  lib-y			+= memcpy.o
> >  lib-y			+= memset.o
> >  lib-y			+= memmove.o
> > +lib-y			+= strcmp.o
> > +lib-y			+= strlen.o
> > +lib-y			+= strncmp.o
> >  lib-$(CONFIG_MMU)	+= uaccess.o
> >  lib-$(CONFIG_64BIT)	+= tishift.o
> >  
> > diff --git a/arch/riscv/lib/strcmp.S b/arch/riscv/lib/strcmp.S
> > new file mode 100644
> > index 000000000000..94440fb8390c
> > --- /dev/null
> > +++ b/arch/riscv/lib/strcmp.S
> > @@ -0,0 +1,37 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#include <linux/linkage.h>
> > +#include <asm/asm.h>
> > +#include <asm-generic/export.h>
> > +
> > +/* int strcmp(const char *cs, const char *ct) */
> > +SYM_FUNC_START(strcmp)
> > +	/*
> > +	 * Returns
> > +	 *   a0 - comparison result, value like strcmp
> > +	 *
> > +	 * Parameters
> > +	 *   a0 - string1
> > +	 *   a1 - string2
> > +	 *
> > +	 * Clobbers
> > +	 *   t0, t1, t2
> > +	 */
> > +	mv	t2, a1
> 
> The above instruction and the 'mv a1, t2' below appear to be attempting
> to preserve a1, but that shouldn't be necessary.

Correct, both moves are gone now.

> > +1:
> > +	lbu	t1, 0(a0)
> > +	lbu	t0, 0(a1)
> 
> I'd rather have t0 be 0(a0) and t1 be 0(a1)

ok

> > +	addi	a0, a0, 1
> > +	addi	a1, a1, 1
> > +	beq	t1, t0, 3f
> > +	li	a0, 1
> > +	bgeu	t1, t0, 2f
> > +	li	a0, -1
> > +2:
> > +	mv	a1, t2
> > +	ret
> > +3:
> > +	bnez	t1, 1b
> > +	li	a0, 0
> > +	j	2b
> 
> For fun I removed one conditional and one unconditional branch (untested)
> 
> 1:
>      lbu     t0, 0(a0)
>      lbu     t1, 0(a1)
>      addi    a0, a0, 1
>      addi    a1, a1, 1
>      bne     t0, t1, 2f
>      bnez    t0, 1b
>      li      a0, 0
>      ret
> 2:
>      slt     a1, t1, t0
>      slli    a1, a1, 1
>      li      a0, -1
>      add     a0, a0, a1
>      ret

Yep, that
- looks correct
- and also seems to produce correct results

I've also included your
	  sub     a0, t0, t1
comment from the follow-up, which then produces the same result as the
zbb-variant.

I've also checked the documented return value convention:
- 0 if s1 and s2 are equal;
- a negative value if s1 is less than s2;
- a positive value if s1 is greater than s2.

And I added a comment above it, pointing out this fact for the next
person stumbling over this :-)
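
For readers following along, here is a minimal C sketch of the two return
value computations discussed above. It is only a model of the assembly, not
the code from the patch; the function names strcmp_model_sign and
strcmp_model_sub are made up for illustration. The first mirrors the
slt/slli/li/add sequence (always +1 or -1), the second mirrors the
"sub a0, t0, t1" variant (byte difference). Both satisfy the sign-only
convention listed above.

```c
/* Hypothetical model: result computed like the slt/slli/li/add sequence. */
int strcmp_model_sign(const char *cs, const char *ct)
{
	const unsigned char *s1 = (const unsigned char *)cs;
	const unsigned char *s2 = (const unsigned char *)ct;
	unsigned char c1, c2;

	for (;;) {
		c1 = *s1++;	/* lbu t0, 0(a0); addi a0, a0, 1 */
		c2 = *s2++;	/* lbu t1, 0(a1); addi a1, a1, 1 */
		if (c1 != c2)	/* bne t0, t1, 2f */
			return -1 + ((c2 < c1) << 1);	/* yields +1 or -1 */
		if (c1 == 0)	/* bnez t0, 1b not taken at the NUL */
			return 0;
	}
}

/* Hypothetical model: result computed like "sub a0, t0, t1". */
int strcmp_model_sub(const char *cs, const char *ct)
{
	const unsigned char *s1 = (const unsigned char *)cs;
	const unsigned char *s2 = (const unsigned char *)ct;
	unsigned char c1, c2;

	for (;;) {
		c1 = *s1++;
		c2 = *s2++;
		if (c1 != c2)
			return (int)c1 - (int)c2;	/* byte difference */
		if (c1 == 0)
			return 0;
	}
}
```

The two models can differ in magnitude but never in sign, which is all the
strcmp convention asks for.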


> > +SYM_FUNC_END(strcmp)
> > diff --git a/arch/riscv/lib/strlen.S b/arch/riscv/lib/strlen.S
> > new file mode 100644
> > index 000000000000..09a7aaff26c8
> > --- /dev/null
> > +++ b/arch/riscv/lib/strlen.S
> > @@ -0,0 +1,28 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#include <linux/linkage.h>
> > +#include <asm/asm.h>
> > +#include <asm-generic/export.h>
> > +
> > +/* int strlen(const char *s) */
> > +SYM_FUNC_START(strlen)
> > +	/*
> > +	 * Returns
> > +	 *   a0 - string length
> > +	 *
> > +	 * Parameters
> > +	 *   a0 - String to measure
> > +	 *
> > +	 * Clobbers:
> > +	 *   t0, t1
> > +	 */
> > +	mv	t1, a0
> > +1:
> > +	lbu	t0, 0(t1)
> > +	bnez	t0, 2f
> > +	sub	a0, t1, a0
> > +	ret
> > +2:
> > +	addi	t1, t1, 1
> > +	j	1b
> 
> Slightly reorganizing looks better (to me)
> 
>    mv    t1, a0
> 1:
>    lbu   t0, 0(t1)
>    beqz  t0, 2f
>    addi  t1, t1, 1
>    j     1b
> 2:
>    sub a0, t1, a0
>    ret

ok
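
As a rough cross-check, the reorganized loop corresponds to the following C
sketch. This is an illustrative model only; strlen_model is a made-up name
and the comments map each statement to the instruction it stands for.

```c
#include <stddef.h>

/* Hypothetical C model of the reorganized strlen loop. */
size_t strlen_model(const char *s)
{
	const char *p = s;		/* mv   t1, a0 */

	while (*p != '\0')		/* lbu  t0, 0(t1); beqz t0, 2f */
		p++;			/* addi t1, t1, 1; j 1b */

	return (size_t)(p - s);		/* sub  a0, t1, a0 */
}
```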


> > +SYM_FUNC_END(strlen)
> > diff --git a/arch/riscv/lib/strncmp.S b/arch/riscv/lib/strncmp.S
> > new file mode 100644
> > index 000000000000..493ab6febcb2
> > --- /dev/null
> > +++ b/arch/riscv/lib/strncmp.S
> > @@ -0,0 +1,40 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#include <linux/linkage.h>
> > +#include <asm/asm.h>
> > +#include <asm-generic/export.h>
> > +
> > +/* int strncmp(const char *cs, const char *ct, size_t count) */
> > +SYM_FUNC_START(strncmp)
> > +	/*
> > +	 * Returns
> > +	 *   a0 - comparison result, value like strncmp
> > +	 *
> > +	 * Parameters
> > +	 *   a0 - string1
> > +	 *   a1 - string2
> > +	 *   a2 - number of characters to compare
> > +	 *
> > +	 * Clobbers
> > +	 *   t0, t1, t2
> > +	 */
> > +	li	t0, 0
> > +1:
> > +	beq	a2, t0, 4f
> > +	add	t1, a0, t0
> > +	add	t2, a1, t0
> > +	lbu	t1, 0(t1)
> > +	lbu	t2, 0(t2)
> > +	beq	t1, t2, 3f
> > +	li	a0, 1
> > +	bgeu	t1, t2, 2f
> > +	li	a0, -1
> > +2:
> > +	ret
> > +3:
> > +	addi	t0, t0, 1
> > +	bnez	t1, 1b
> > +4:
> > +	li	a0, 0
> > +	j	2b
> 
> (untested)
> 
>      li      t2, 0
> 1:
>      beq     a2, t2, 2f
>      lbu     t0, 0(a0)
>      lbu     t1, 0(a1)
>      addi    a0, a0, 1
>      addi    a1, a1, 1
>      bne     t0, t1, 3f
>      addi    t2, t2, 1
>      bnez    t0, 1b
> 2:
>      li      a0, 0
>      ret
> 3:
>      slt     a1, t1, t0
>      slli    a1, a1, 1
>      li      a0, -1
>      add     a0, a0, a1
>      ret

Same here: I went over the changed assembly and verified that
it produces the same results as the original, and then did a second
pass to also add the
	sub     a0, t0, t1
replacement.
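
For illustration, a minimal C sketch of the suggested strncmp loop with the
"sub a0, t0, t1" replacement applied. This is a model under the assumption
that the only change to the quoted assembly is the return value computation;
strncmp_model is a made-up name, not something from the patch.

```c
#include <stddef.h>

/* Hypothetical C model of the suggested strncmp loop, with the result
 * computed as a byte difference instead of +1 / -1. */
int strncmp_model(const char *cs, const char *ct, size_t count)
{
	const unsigned char *s1 = (const unsigned char *)cs;
	const unsigned char *s2 = (const unsigned char *)ct;
	size_t i = 0;				/* li   t2, 0 */

	while (i != count) {			/* beq  a2, t2, 2f */
		unsigned char c1 = *s1++;	/* lbu  t0, 0(a0); addi a0, a0, 1 */
		unsigned char c2 = *s2++;	/* lbu  t1, 0(a1); addi a1, a1, 1 */

		if (c1 != c2)			/* bne  t0, t1, 3f */
			return (int)c1 - (int)c2;	/* sub a0, t0, t1 */
		i++;				/* addi t2, t2, 1 */
		if (c1 == 0)			/* bnez t0, 1b not taken at the NUL */
			break;
	}
	return 0;				/* li   a0, 0 */
}
```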


> > +SYM_FUNC_END(strncmp)
> > diff --git a/arch/riscv/purgatory/Makefile b/arch/riscv/purgatory/Makefile
> > index dd58e1d99397..d16bf715a586 100644
> > --- a/arch/riscv/purgatory/Makefile
> > +++ b/arch/riscv/purgatory/Makefile
> > @@ -2,6 +2,7 @@
> >  OBJECT_FILES_NON_STANDARD := y
> >  
> >  purgatory-y := purgatory.o sha256.o entry.o string.o ctype.o memcpy.o memset.o
> > +purgatory-y += strcmp.o strlen.o strncmp.o
> >  
> >  targets += $(purgatory-y)
> >  PURGATORY_OBJS = $(addprefix $(obj)/,$(purgatory-y))
> > @@ -18,6 +19,15 @@ $(obj)/memcpy.o: $(srctree)/arch/riscv/lib/memcpy.S FORCE
> >  $(obj)/memset.o: $(srctree)/arch/riscv/lib/memset.S FORCE
> >  	$(call if_changed_rule,as_o_S)
> >  
> > +$(obj)/strcmp.o: $(srctree)/arch/riscv/lib/strcmp.S FORCE
> > +	$(call if_changed_rule,as_o_S)
> > +
> > +$(obj)/strlen.o: $(srctree)/arch/riscv/lib/strlen.S FORCE
> > +	$(call if_changed_rule,as_o_S)
> > +
> > +$(obj)/strncmp.o: $(srctree)/arch/riscv/lib/strncmp.S FORCE
> > +	$(call if_changed_rule,as_o_S)
> > +
> >  $(obj)/sha256.o: $(srctree)/lib/crypto/sha256.c FORCE
> >  	$(call if_changed_rule,cc_o_c)
> >  
> > @@ -77,6 +87,9 @@ CFLAGS_ctype.o			+= $(PURGATORY_CFLAGS)
> >  AFLAGS_REMOVE_entry.o		+= -Wa,-gdwarf-2
> >  AFLAGS_REMOVE_memcpy.o		+= -Wa,-gdwarf-2
> >  AFLAGS_REMOVE_memset.o		+= -Wa,-gdwarf-2
> > +AFLAGS_REMOVE_strcmp.o		+= -Wa,-gdwarf-2
> > +AFLAGS_REMOVE_strlen.o		+= -Wa,-gdwarf-2
> > +AFLAGS_REMOVE_strncmp.o		+= -Wa,-gdwarf-2
> >  
> >  $(obj)/purgatory.ro: $(PURGATORY_OBJS) FORCE
> >  		$(call if_changed,ld)
> >
> 
> With at least the removal of the unnecessary preserving of a1 in strcmp,
> then it looks correct to me, so
> 
> Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> 
> But I think there's room for making it more readable, and maybe even
> optimized, as I've tried to do.


Thanks a lot
Heiko


