From: "Heiko Stübner" <heiko@sntech.de>
To: Andrew Jones <ajones@ventanamicro.com>
Cc: linux-riscv@lists.infradead.org, palmer@dabbelt.com,
christoph.muellner@vrull.eu, conor@kernel.org,
philipp.tomsich@vrull.eu, jszhang@kernel.org
Subject: Re: [PATCH v4 4/5] RISC-V: add infrastructure to allow different str* implementations
Date: Thu, 12 Jan 2023 17:05:19 +0100
Message-ID: <2094019.QZUTf85G27@diego>
In-Reply-To: <20230110121320.zr4tk4nl2w57klqg@orel>
Hi Andrew,
thanks a lot for taking the time to provide these extensive and really
helpful comments.
On Tuesday, 10 January 2023, 13:13:20 CET, Andrew Jones wrote:
> On Mon, Jan 09, 2023 at 07:17:54PM +0100, Heiko Stuebner wrote:
> > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> >
> > Depending on supported extensions on specific RISC-V cores,
> > optimized str* functions might make sense.
> >
> > This adds basic infrastructure to allow patching the function calls
> > via alternatives later on.
> >
> > The Linux kernel provides standard implementations for string functions
> > but when architectures want to extend them, they need to provide their
> > own.
> >
> > The added generic string functions are done in assembler (taken from
> > disassembling the main-kernel functions for now) to allow us to control
> > the used registers and extend them with optimized variants.
> >
> > Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > ---
> > arch/riscv/include/asm/string.h | 10 +++++++++
> > arch/riscv/kernel/riscv_ksyms.c | 3 +++
> > arch/riscv/lib/Makefile | 3 +++
> > arch/riscv/lib/strcmp.S | 37 ++++++++++++++++++++++++++++++
> > arch/riscv/lib/strlen.S | 28 +++++++++++++++++++++++
> > arch/riscv/lib/strncmp.S | 40 +++++++++++++++++++++++++++++++++
> > arch/riscv/purgatory/Makefile | 13 +++++++++++
> > 7 files changed, 134 insertions(+)
> > create mode 100644 arch/riscv/lib/strcmp.S
> > create mode 100644 arch/riscv/lib/strlen.S
> > create mode 100644 arch/riscv/lib/strncmp.S
> >
> > diff --git a/arch/riscv/include/asm/string.h b/arch/riscv/include/asm/string.h
> > index 909049366555..a96b1fea24fe 100644
> > --- a/arch/riscv/include/asm/string.h
> > +++ b/arch/riscv/include/asm/string.h
> > @@ -18,6 +18,16 @@ extern asmlinkage void *__memcpy(void *, const void *, size_t);
> > #define __HAVE_ARCH_MEMMOVE
> > extern asmlinkage void *memmove(void *, const void *, size_t);
> > extern asmlinkage void *__memmove(void *, const void *, size_t);
> > +
> > +#define __HAVE_ARCH_STRCMP
> > +extern asmlinkage int strcmp(const char *cs, const char *ct);
> > +
> > +#define __HAVE_ARCH_STRLEN
> > +extern asmlinkage __kernel_size_t strlen(const char *);
> > +
> > +#define __HAVE_ARCH_STRNCMP
> > +extern asmlinkage int strncmp(const char *cs, const char *ct, size_t count);
> > +
> > /* For those files which don't want to check by kasan. */
> > #if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__)
> > #define memcpy(dst, src, len) __memcpy(dst, src, len)
> > diff --git a/arch/riscv/kernel/riscv_ksyms.c b/arch/riscv/kernel/riscv_ksyms.c
> > index 5ab1c7e1a6ed..a72879b4249a 100644
> > --- a/arch/riscv/kernel/riscv_ksyms.c
> > +++ b/arch/riscv/kernel/riscv_ksyms.c
> > @@ -12,6 +12,9 @@
> > EXPORT_SYMBOL(memset);
> > EXPORT_SYMBOL(memcpy);
> > EXPORT_SYMBOL(memmove);
> > +EXPORT_SYMBOL(strcmp);
> > +EXPORT_SYMBOL(strlen);
> > +EXPORT_SYMBOL(strncmp);
> > EXPORT_SYMBOL(__memset);
> > EXPORT_SYMBOL(__memcpy);
> > EXPORT_SYMBOL(__memmove);
> > diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
> > index 25d5c9664e57..6c74b0bedd60 100644
> > --- a/arch/riscv/lib/Makefile
> > +++ b/arch/riscv/lib/Makefile
> > @@ -3,6 +3,9 @@ lib-y += delay.o
> > lib-y += memcpy.o
> > lib-y += memset.o
> > lib-y += memmove.o
> > +lib-y += strcmp.o
> > +lib-y += strlen.o
> > +lib-y += strncmp.o
> > lib-$(CONFIG_MMU) += uaccess.o
> > lib-$(CONFIG_64BIT) += tishift.o
> >
> > diff --git a/arch/riscv/lib/strcmp.S b/arch/riscv/lib/strcmp.S
> > new file mode 100644
> > index 000000000000..94440fb8390c
> > --- /dev/null
> > +++ b/arch/riscv/lib/strcmp.S
> > @@ -0,0 +1,37 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#include <linux/linkage.h>
> > +#include <asm/asm.h>
> > +#include <asm-generic/export.h>
> > +
> > +/* int strcmp(const char *cs, const char *ct) */
> > +SYM_FUNC_START(strcmp)
> > + /*
> > + * Returns
> > + * a0 - comparison result, value like strcmp
> > + *
> > + * Parameters
> > + * a0 - string1
> > + * a1 - string2
> > + *
> > + * Clobbers
> > + * t0, t1, t2
> > + */
> > + mv t2, a1
>
> The above instruction and the 'mv a1, t2' below appear to be attempting
> to preserve a1, but that shouldn't be necessary.
Correct, both moves are gone now.
> > +1:
> > + lbu t1, 0(a0)
> > + lbu t0, 0(a1)
>
> I'd rather have t0 be 0(a0) and t1 be 0(a1)
ok
> > + addi a0, a0, 1
> > + addi a1, a1, 1
> > + beq t1, t0, 3f
> > + li a0, 1
> > + bgeu t1, t0, 2f
> > + li a0, -1
> > +2:
> > + mv a1, t2
> > + ret
> > +3:
> > + bnez t1, 1b
> > + li a0, 0
> > + j 2b
>
> For fun I removed one conditional and one unconditional branch (untested)
>
> 1:
> lbu t0, 0(a0)
> lbu t1, 0(a1)
> addi a0, a0, 1
> addi a1, a1, 1
> bne t0, t1, 2f
> bnez t0, 1b
> li a0, 0
> ret
> 2:
> slt a1, t1, t0
> slli a1, a1, 1
> li a0, -1
> add a0, a0, a1
> ret
Yep, that
- looks correct
- and also seems to produce correct results,
also when including your
	sub a0, t0, t1
comment from the follow-up, which then produces the same result as the
zbb-variant.
I've also checked the return-value convention against the documentation:
- 0 if s1 and s2 are equal;
- a negative value if s1 is less than s2;
- a positive value if s1 is greater than s2.
And I added a comment above it, pointing out this fact for the next
person stumbling over this :-)
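For anyone following along, here is a rough C model of the two tails
discussed above. It is only a sketch for reasoning about the return
values, not the code in the patch; cmp_tail_slt, cmp_tail_sub and
strcmp_model are made-up names, and the register names in the comments
just mirror the assembly quoted above.

```c
#include <stddef.h>

/* Models the slt/slli/li/add tail: the result is exactly -1 or +1. */
static int cmp_tail_slt(unsigned char t0, unsigned char t1)
{
	int a1 = (t1 < t0);	/* slt  a1, t1, t0 */

	a1 <<= 1;		/* slli a1, a1, 1 */
	return -1 + a1;		/* li a0, -1; add a0, a0, a1 */
}

/* Models the sub tail: the result is the difference of the two bytes. */
static int cmp_tail_sub(unsigned char t0, unsigned char t1)
{
	return (int)t0 - (int)t1;	/* sub a0, t0, t1 */
}

/* Models the loop: compare byte by byte until a mismatch or the NUL. */
static int strcmp_model(const char *cs, const char *ct)
{
	unsigned char t0, t1;

	do {
		t0 = (unsigned char)*cs++;	/* lbu t0, 0(a0); addi a0, a0, 1 */
		t1 = (unsigned char)*ct++;	/* lbu t1, 0(a1); addi a1, a1, 1 */
		if (t0 != t1)			/* bne t0, t1, 2f */
			return cmp_tail_sub(t0, t1);
	} while (t0);				/* bnez t0, 1b */

	return 0;				/* li a0, 0 */
}
```

Both tails only need to agree in sign: the first returns -1 or +1, the
second the byte difference, and either satisfies the convention listed
above because the bytes are loaded zero-extended.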
> > +SYM_FUNC_END(strcmp)
> > diff --git a/arch/riscv/lib/strlen.S b/arch/riscv/lib/strlen.S
> > new file mode 100644
> > index 000000000000..09a7aaff26c8
> > --- /dev/null
> > +++ b/arch/riscv/lib/strlen.S
> > @@ -0,0 +1,28 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#include <linux/linkage.h>
> > +#include <asm/asm.h>
> > +#include <asm-generic/export.h>
> > +
> > +/* int strlen(const char *s) */
> > +SYM_FUNC_START(strlen)
> > + /*
> > + * Returns
> > + * a0 - string length
> > + *
> > + * Parameters
> > + * a0 - String to measure
> > + *
> > + * Clobbers:
> > + * t0, t1
> > + */
> > + mv t1, a0
> > +1:
> > + lbu t0, 0(t1)
> > + bnez t0, 2f
> > + sub a0, t1, a0
> > + ret
> > +2:
> > + addi t1, t1, 1
> > + j 1b
>
> Slightly reorganizing looks better (to me)
>
> mv t1, a0
> 1:
> lbu t0, 0(t1)
> beqz t0, 2f
> addi t1, t1, 1
> j 1b
> 2:
> sub a0, t1, a0
> ret
ok
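As a cross-check of the reorganized loop, a C sketch of the same control
flow could look like the following. strlen_model is a hypothetical name
and this is only an illustration of the logic, not what ends up in the
patch.

```c
#include <stddef.h>

/* Walks a cursor to the terminating NUL and returns the distance. */
static size_t strlen_model(const char *s)
{
	const char *t1 = s;		/* mv t1, a0 */

	while (*t1 != '\0')		/* lbu t0, 0(t1); beqz t0, 2f */
		t1++;			/* addi t1, t1, 1; j 1b */

	return (size_t)(t1 - s);	/* sub a0, t1, a0 */
}
```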
> > +SYM_FUNC_END(strlen)
> > diff --git a/arch/riscv/lib/strncmp.S b/arch/riscv/lib/strncmp.S
> > new file mode 100644
> > index 000000000000..493ab6febcb2
> > --- /dev/null
> > +++ b/arch/riscv/lib/strncmp.S
> > @@ -0,0 +1,40 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#include <linux/linkage.h>
> > +#include <asm/asm.h>
> > +#include <asm-generic/export.h>
> > +
> > +/* int strncmp(const char *cs, const char *ct, size_t count) */
> > +SYM_FUNC_START(strncmp)
> > + /*
> > + * Returns
> > + * a0 - comparison result, value like strncmp
> > + *
> > + * Parameters
> > + * a0 - string1
> > + * a1 - string2
> > + * a2 - number of characters to compare
> > + *
> > + * Clobbers
> > + * t0, t1, t2
> > + */
> > + li t0, 0
> > +1:
> > + beq a2, t0, 4f
> > + add t1, a0, t0
> > + add t2, a1, t0
> > + lbu t1, 0(t1)
> > + lbu t2, 0(t2)
> > + beq t1, t2, 3f
> > + li a0, 1
> > + bgeu t1, t2, 2f
> > + li a0, -1
> > +2:
> > + ret
> > +3:
> > + addi t0, t0, 1
> > + bnez t1, 1b
> > +4:
> > + li a0, 0
> > + j 2b
>
> (untested)
>
> li t2, 0
> 1:
> beq a2, t2, 2f
> lbu t0, 0(a0)
> lbu t1, 0(a1)
> addi a0, a0, 1
> addi a1, a1, 1
> bne t0, t1, 3f
> addi t2, t2, 1
> bnez t0, 1b
> 2:
> li a0, 0
> ret
> 3:
> slt a1, t1, t0
> slli a1, a1, 1
> li a0, -1
> add a0, a0, a1
> ret
Same here: I went over the changed assembly and verified that
it produces the same results as the original, and then did a second
pass to also add the
	sub a0, t0, t1
replacement.
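A rough C model of the counted loop with the sub tail, again only as a
sketch for checking the logic. strncmp_model is a made-up name, and
indexing with t2 stands in for advancing a0 and a1 in the assembly.

```c
#include <stddef.h>

/* Compares at most count bytes; stops early on a mismatch or a NUL. */
static int strncmp_model(const char *cs, const char *ct, size_t count)
{
	size_t t2;

	for (t2 = 0; t2 != count; t2++) {		/* beq a2, t2, 2f */
		unsigned char t0 = (unsigned char)cs[t2];	/* lbu t0, 0(a0) */
		unsigned char t1 = (unsigned char)ct[t2];	/* lbu t1, 0(a1) */

		if (t0 != t1)				/* bne t0, t1, 3f */
			return (int)t0 - (int)t1;	/* sub a0, t0, t1 */
		if (t0 == 0)				/* bnez t0, 1b falls through */
			break;
	}

	return 0;					/* li a0, 0 */
}
```

Note that a count of 0 returns 0 without reading either string, which
matches the early beq at the top of the loop.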
> > +SYM_FUNC_END(strncmp)
> > diff --git a/arch/riscv/purgatory/Makefile b/arch/riscv/purgatory/Makefile
> > index dd58e1d99397..d16bf715a586 100644
> > --- a/arch/riscv/purgatory/Makefile
> > +++ b/arch/riscv/purgatory/Makefile
> > @@ -2,6 +2,7 @@
> > OBJECT_FILES_NON_STANDARD := y
> >
> > purgatory-y := purgatory.o sha256.o entry.o string.o ctype.o memcpy.o memset.o
> > +purgatory-y += strcmp.o strlen.o strncmp.o
> >
> > targets += $(purgatory-y)
> > PURGATORY_OBJS = $(addprefix $(obj)/,$(purgatory-y))
> > @@ -18,6 +19,15 @@ $(obj)/memcpy.o: $(srctree)/arch/riscv/lib/memcpy.S FORCE
> > $(obj)/memset.o: $(srctree)/arch/riscv/lib/memset.S FORCE
> > $(call if_changed_rule,as_o_S)
> >
> > +$(obj)/strcmp.o: $(srctree)/arch/riscv/lib/strcmp.S FORCE
> > + $(call if_changed_rule,as_o_S)
> > +
> > +$(obj)/strlen.o: $(srctree)/arch/riscv/lib/strlen.S FORCE
> > + $(call if_changed_rule,as_o_S)
> > +
> > +$(obj)/strncmp.o: $(srctree)/arch/riscv/lib/strncmp.S FORCE
> > + $(call if_changed_rule,as_o_S)
> > +
> > $(obj)/sha256.o: $(srctree)/lib/crypto/sha256.c FORCE
> > $(call if_changed_rule,cc_o_c)
> >
> > @@ -77,6 +87,9 @@ CFLAGS_ctype.o += $(PURGATORY_CFLAGS)
> > AFLAGS_REMOVE_entry.o += -Wa,-gdwarf-2
> > AFLAGS_REMOVE_memcpy.o += -Wa,-gdwarf-2
> > AFLAGS_REMOVE_memset.o += -Wa,-gdwarf-2
> > +AFLAGS_REMOVE_strcmp.o += -Wa,-gdwarf-2
> > +AFLAGS_REMOVE_strlen.o += -Wa,-gdwarf-2
> > +AFLAGS_REMOVE_strncmp.o += -Wa,-gdwarf-2
> >
> > $(obj)/purgatory.ro: $(PURGATORY_OBJS) FORCE
> > $(call if_changed,ld)
> >
>
> With at least the removal of the unnecessary preserving of a1 in strcmp,
> then it looks correct to me, so
>
> Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
>
> But I think there's room for making it more readable, and maybe even
> optimized, as I've tried to do.
Thanks a lot
Heiko